AI Factor launches tonight. Beta program ends at midnight

I am not deep into the AI system (and maybe this is a stupid idea), but if I understand the problem, it's mainly that macro factors have more impact on industry/sector selection than on stock selection itself, and that macro factors are long-term in nature?
A solution could be to create an industry/sector target trained on these macro factors, so that you know which sector works best in which macro environment.
That could then be merged as a linear factor into the ranking system for stock selection. In fact, it could be added to any ranking system.
I think Yuval has done something similar, but very static, with a factor for certain sectors/industries.

On the macro modeling topic, I believe that for tree-based algorithms you could use raw or normalized values across the entire training set; just don't normalize by date. Using interest rates as an example: they will be the same for every stock on a particular date, but when the training data is pooled across time there will be variability in interest rates, allowing a tree-based model to make regime-based splits. Please correct me if I am wrong here; I have not tried it myself.
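Here is a small sketch of that idea on invented data (the 0.04 regime boundary and the return relationship are made up for illustration): once the panel is pooled across dates, the interest-rate column varies, and a tree can split on it even though the rate is constant within each date.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical pooled panel: 40 dates x 50 stocks.
# The interest rate is identical for every stock on a given date,
# but varies across dates once the panel is pooled.
n_dates, n_stocks = 40, 50
rates = rng.uniform(0.0, 0.08, size=n_dates)   # one rate per date
value = rng.normal(size=(n_dates, n_stocks))   # a per-stock factor

# Illustrative returns: a level shift in high-rate regimes plus a
# small stock-level effect (synthetic, not a real relationship).
returns = (1.0 * (rates[:, None] > 0.04)
           + 0.1 * value
           + 0.05 * rng.normal(size=(n_dates, n_stocks)))

X = np.column_stack([np.repeat(rates, n_stocks), value.ravel()])
y = returns.ravel()

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# The root split lands on column 0 (the pooled interest rate),
# with a threshold near the 0.04 regime boundary.
print(tree.tree_.feature[0], tree.tree_.threshold[0])
```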


Correct. Any monotonic transformation of all of the data for a feature makes no difference with tree-based models.
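That invariance is easy to verify on toy data: train one tree on a raw feature and another on a monotonically transformed copy. Because splits depend only on the ordering of feature values, both trees partition the training points identically and their fitted values agree exactly.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=300)

raw = DecisionTreeRegressor(max_depth=4, random_state=0).fit(x, y)
# exp() is strictly increasing, so it preserves the ordering of the
# feature values, and tree splits only ever use that ordering.
mono = DecisionTreeRegressor(max_depth=4, random_state=0).fit(np.exp(x), y)

# Identical partitions imply identical fitted values on the training set.
same = np.allclose(raw.predict(x), mono.predict(np.exp(x)))
print(same)
```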

Everything I have ever read on (or done with) neural nets just does a z-score on every feature (again, not normalized by date).

Z-score would work with linear models, wouldn't it? The equation for the slope basically uses z-scores of the raw data, so it does not hurt to standardize first.
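A quick numerical check of that claim on made-up data: after z-scoring both variables, the OLS slope is simply the Pearson correlation, and the raw-data slope is recovered by rescaling, so standardizing changes the units of the coefficient but nothing substantive.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 0.7 * x + rng.normal(size=500)

# OLS slope on the raw data: beta = cov(x, y) / var(x)
beta_raw = np.cov(x, y, ddof=0)[0, 1] / np.var(x)

# After z-scoring both variables, the OLS slope equals the
# Pearson correlation.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
beta_z = np.cov(zx, zy, ddof=0)[0, 1] / np.var(zx)

print(beta_z, np.corrcoef(x, y)[0, 1])       # identical
print(beta_raw, beta_z * y.std() / x.std())  # identical after rescaling
```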

Claude 3: " The equation for slope in simple linear regression does indeed effectively use z-scores internally, as you noted…….One small addition: While z-score standardization is generally safe and useful for linear models, there are cases where other types of scaling might be preferred, such as when dealing with features that have heavy-tailed distributions."

I.e., maybe you would do something like a log transformation for some features when using linear models, on a case-by-case basis. I think neural nets can handle fat tails and skew.
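As an illustration of the case-by-case log transformation (synthetic lognormal values standing in for a heavy-tailed feature such as market cap):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(3)
# Lognormal data as a stand-in for a heavy-tailed feature
# (purely illustrative, not from any real dataset).
feature = rng.lognormal(mean=10.0, sigma=1.0, size=5000)

raw_skew = skew(feature)          # strongly right-skewed
log_skew = skew(np.log(feature))  # roughly symmetric after the log
print(raw_skew, log_skew)
```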

Claude 3 on neural nets: "Your point about neural networks being able to handle fat tails and skew is also correct: neural networks, especially deep ones, can indeed learn to handle non-linear relationships and complex distributions, including those with fat tails or skew."

And with neural nets you can recognize cats (none of that data is normally distributed, and there is no transformation that will make it so). My point is that neural nets can handle difficult data distributions.

It is very hard to get the benefits from macro factors. But you can try it.

After thinking about it, probably a lot of the factors that drive macro returns are already available in Industry Momentum.

We've been discussing this... a "Macro AI Factor" that either returns predictions for each industry/sector, or a single prediction for the market.

Not really. I don't use any macro factors at all. I do have a node in my ranking systems that ranks subsectors according to how strongly they react to my factors in general. That way biopharma gets almost completely ignored by my ranking systems. I do agree that it might be nice to try an ML approach to creating an industry/sector target that can be trained on these macro factors, so that you know which sector works best in which macro environment.

K-nearest neighbors will pick up both momentum and mean reversion (it is non-linear). If you don't throw too many factors into the mix, it works, marginally. Logistic regression (for market direction), not so much in my experience. I used returns of sectors as features and future returns as targets. I have not tried macro factors like interest rates.
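A toy sketch of the kind of non-linear pattern KNN can capture but a linear fit cannot. The data and the momentum/mean-reversion rule below are invented purely for illustration, not taken from any real sector series.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)

# Invented rule: momentum for small past moves, mean reversion for
# large ones. A single linear fit cannot represent this shape.
x = rng.uniform(-2, 2, size=(600, 1))
signal = np.where(np.abs(x[:, 0]) < 1, x[:, 0], -x[:, 0])
y = signal + 0.1 * rng.normal(size=600)

x_tr, x_te, y_tr, y_te = x[:500], x[500:], y[:500], y[500:]

knn = KNeighborsRegressor(n_neighbors=10).fit(x_tr, y_tr)  # one main hyperparameter: k
lin = LinearRegression().fit(x_tr, y_tr)

knn_mse = mean_squared_error(y_te, knn.predict(x_te))
lin_mse = mean_squared_error(y_te, lin.predict(x_te))
print(knn_mse, lin_mse)  # KNN sits near the noise floor; the linear fit does not
```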

KNN is a simple algorithm with one main hyperparameter (the number of neighbors, k), so it might be worth a try. But if it doesn't work, I think you will need a different set of factors. KNN is near optimal if you do not use too many features.

Claude 3: "Near-optimality with few features: Your statement about KNN being near-optimal with few features is generally correct, especially for problems where local patterns in the feature space are informative."

It seems like a categorical variable.

Indeed many models like to recommend biopharma stocks for some reason.

This we never considered. But you can try it now using the predictions stored during validation. For example, to calculate the average of the three most recent predictions, do the following:

  1. Run a validation
  2. Make sure you click 'Save Predictions'
  3. Copy the formula when validation is done (the clipboard icon)
  4. Enter the following in the screener
@avg:FHistAvg(`AIFactorValidation("AI factor name", "model")`, 3, 4)

The formula above will store the average of the last three predictions in the @avg variable, and you will see the values when you run the screen.

The '4' is the 'weeks_increment' parameter of FHistAvg. I used 4 because the model I used was trained on a dataset with a 4-week frequency.
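To check the intuition for what that formula computes, here is a rough pandas analogue. The helper name and data are hypothetical; this only mirrors the idea described above (averaging n values spaced weeks_increment apart), not P123's exact implementation of FHistAvg.

```python
import pandas as pd

# Hypothetical weekly predictions for one stock.
dates = pd.date_range("2024-01-05", periods=12, freq="W-FRI")
preds = pd.Series(range(12), index=dates, dtype=float)

def fhist_avg(series, n=3, weeks_increment=4):
    """Average the latest value and the n-1 values spaced
    weeks_increment rows back (a rough analogue of FHistAvg)."""
    picks = [series.iloc[-1 - i * weeks_increment] for i in range(n)]
    return sum(picks) / n

# Averages the predictions from 0, 4 and 8 weeks before the latest date.
print(fhist_avg(preds))
```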

This "trick" will not work for current predictions using AIFactor().

If this proves useful and improves performance we'll add much simpler ways, and support it for predictions.

Thank you! I have been using rank tolerance in the backtesting engine as an approximation, but I will try this as well.

$100/mo for the predictor seems steep. Can we discount this to $50/mo?

After reading the comments, billing is even more confusing to me.

The AI Factor $50 credit seems to be a true (monthly?) credit, with billing starting after the credit is consumed.

The Predictor charge seems unrelated to the number of predictors used. Is that right?

While I think the resource-pricing model has a lot of problems, perhaps a better way to bill for predictors would be to double the screener/simulation/strategy resource cost when a predictor is used.

Currently, a Live Strategy costs 10 resource units. If it used a component that instantiates a predictor, the cost would rise to 20 units.

Pick how you want to up-charge - 2x, 3x, 5x, etc.

That way everyone would have access to AI w/o upfront charges.

So, when I have trained for an accumulated sum of $50 or more, I get billed.
If I train for $2 a day for 25 days in a row, I will get billed $50 at midnight of day 25. After I get billed, a new $50 credit starts. That sounds fair.

If I choose not to activate the prediction function, I can still train and validate AI Factors as long as I have the Ultimate user level? And I can see the rankings on the Prediction page with no other extra costs? Only if I want to use the AI Factors in the standard P123 screener, rankings, and strategies would I then need to sign up for the $100 add-on?

All in all, if I got it right, I don't think the pricing is too unfair if I don't need the add-on while only training. I can do a whole lot of training for $20 now; I kind of know what works.
However, you might want to consider offering a discount for newbies. It takes a lot of experimentation to figure out where to focus, and it can be discouraging if the only outcome is bills piling up while experimenting.

Here is a Knowledge Base article which explains the pricing. Hopefully this clarifies everything, but please let me know if anything is unclear. We will also be updating the AI User's Guide soon to include all the recent changes.

Sorry we didn't provide this earlier to avoid the confusion.

Could you provide the total cost incurred for using the feature during the beta period? It would be helpful to break it down by each AI Factor module, as this would give us a better understanding of the costs moving forward, especially since we refined our usage during the beta phase.

From the Knowledge Base article - "When you enable AI Factor in your account, you will receive a $50 credit to use for worker training time."

So the $50 credit is a one-time deal?

I do not think the price for training—as I understand it now—is unreasonable. Prediction: not so much. In any case, P123 has done a good job and I would like to find a use for it (for training at least).

As I see it, P123 is competing against itself. We can download data and use our own computers, or we can use the AI/ML at a not-too-high cost for training.

P123 could consider leveraging the advantages of AI/ML to encourage greater usage OF BOTH METHODS.

Each method (API or downloads vs AI/ML) has its advantages.

Advantage API:

  • Complete control of the Python code, allowing transformations of the data (a recent hot topic, with @marco and others suggesting it is vitally important for macro predictions), different modules, libraries, etc. Classification, early stopping that works… the list is endless.

Sklearn is open source, with a lot of brilliant people having worked on the code for a very long time. As long as access to it is restricted, there will be something an advanced member would like to use that is not available with P123.

  • Data manipulation, including excess returns relative to the universe (which I guess no one else finds important; people seem to like excess returns relative to cap-weighted benchmarks for some reason). Really, this is just one example of control without having to make a feature request that may or may not be implemented at some date in the future.

Advantage AI/ML:

  • The main advantage here is instant access to any and all features. Full stop

I.e., I think the main advantage of the AI/ML will be in the form of feature selection. Forever.

Once the features are selected, either works (AI or API), with the API having the advantage for advanced users who want more control and more options. This will never change as long as full access to Sklearn is not possible.

Maybe I am wrong, but making feature selection a priority might benefit everyone. Or at least it is something worth discussing.

Something only jrinne would say: in game theory (taught in most elite business schools) and in economic theory, a Nash equilibrium is often the best solution. Basically, one finds the pricing where he or she is INDIFFERENT to what the other person does. They don't care.

So the owner says: "I got better things to worry about than what my customer/competition decides to do. I profit either way"

Usually the Nash equilibrium makes BOTH PARTIES (business owner and customer) indifferent to what the other does. And BTW, I am told there is always a Nash equilibrium.

So the customer simply says: "I use both."

As far as pricing advice goes, P123 could consider finding a "Nash equilibrium" where both P123 and the member are indifferent to which solution the member uses (API or AI).

Caveats:

  • Sunk cost should not matter: The goal is to maximize future revenue.

  • It is harder to calculate a Nash equilibrium if there is something that can be done to grow the revenue from both options.

  • Each customer will have a different Nash equilibrium

I am not claiming to know the optimal pricing solution. That is for P123 to decide. That is one of the fun things about owning a business, and I would not want to interfere with that.

Probably not so useful, I know. But Nash had a point (won the Nobel Prize for it) and "A Beautiful Mind" was a not-so-bad movie :slightly_smiling_face:

Maybe a bit of a stretch, but here is what Claude 3 has to say: "This is actually a good scenario for applying Nash equilibrium concepts…"

Jim

Thanks, P123 team! It's great to see this milestone, with the AI officially launched.

The $100/mo may be expensive for most of us before we prove that it can provide better results in a live run than the traditional approach. Maybe we can consider the cost model below:

WalterW suggested using resource units to charge users, but that may not bring enough income for P123 to break even if users manage to use AI while only consuming their existing RUs. Maybe users could instead just buy predictor instances/references: say one predictor instance (reference) costs $10 per month (or a bit less/more). This would be fairer because some users may use more predictors and some fewer; let users pay more if they use more.

The above cost model will need more development work.
Thanks

Taofen

For the first time I see that pricing is for fun and not just for profit. That's probably why I heard about a billionaire who personally sets the price of every product, even in his 90s.
