AI Factors - ETA SOON! (beta release)

We are building a platform module that will allow users to train and deploy best-in-class machine learning models to produce more intelligent alpha signals from fundamentals and estimates available within Portfolio123.

The predictions from trained models will be available as custom factors across all P123 tools, so that you can use them in the entire portfolio lifecycle.

For example, you will be able to train a regression tree model from stock factors and formulas of your choice, and use it to produce an array of 6 months expected excess returns on any universe and deploy it in a ranking system.
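To make the idea concrete, here is a minimal offline sketch of training a regression tree on stock factors to predict 6-month excess returns. This is not the P123 implementation; the factor matrix and target here are synthetic stand-ins for a real download:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical factor matrix: one row per stock, one column per factor
# (e.g. value, momentum, quality scores from a factor download)
X = rng.normal(size=(500, 3))

# Synthetic target standing in for 6-month excess return vs. the universe
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=500)

model = DecisionTreeRegressor(max_depth=4, random_state=0)
model.fit(X, y)

# Predictions form an array of expected excess returns, one per stock,
# which is exactly the kind of score a ranking system could consume
pred = model.predict(X)
print(pred.shape)
```

In practice the predictions would be re-generated at each rebalance date and fed into a ranking system as a custom factor.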


Proof of concept
Architecture and infrastructure ready
Functional Prototype
Full integration in P123
Alpha deployment

Beta deployment (available to beta testers)
Production deployment

How do I get access to the Beta site? I would like to test the new functionality.

Now that’s exciting!

Shooting for the middle of March. We decided to take a little longer to make a better first impression.


I too would like to get access to the Beta Site for testing new functionality.

It would be really great to get some kind of support video when this tool comes out. It seems fairly advanced, and to avoid it being used only by people who are already comfortable with ML, a video or webinar would help a lot.

Since it has been in beta for some time, is it possible to publish how machine learning could, for instance, improve the Core Combination? This will show some of the usefulness of this tool.


This is a great suggestion.
Take something we all have and are familiar with (a ranking system) and show us how this can be used to create a set of features and a target variable in the ML system, and (I assume) improve the RS.

Thanks for all the work that went into this new feature. Now you P123 guys can up-sell me GPU credits to run the ML stuff in the cloud, and I will happily pay for it :slight_smile:


We’re not sure yet we’ll need a beta site. It’s a completely new component and does not affect the rest of the site.

As far as support videos go, we will have a webinar. Yes, it did feel incredibly advanced at first, but not anymore. Sure, the guts are very sophisticated ML algorithms, but from high above it’s just another way to score stocks. We’re trying very hard to make it as intuitive as possible. We went through several iterations (mistakes), but we’re feeling bullish.

Another goal was to seriously start focusing our tools on robustness rather than straight-out 20-year CAGR. You will see this soon when we release a redesigned Rank Performance tool. The same approach is used for the AI factors, and it will be carried over to future redesigns.

Regarding how to use it in conjunction with “classic P123”, that remains to be seen and explored. We do not know. You will be able to add AI factors inside ranking systems by themselves or in conjunction with other linearly ranked factors. You’ll also be able to use them in formulas. Should be interesting.




Thanks for the update. The more you describe the direction your ML project is taking and the user interface you are working on, the better it looks and the more I’m looking forward to it.
Like many others here, I’ve played around with selecting factors, doing historical factor data downloads, splitting the data into train and test sets, and running a few ML programs . . . Random Forest, SVM. Then I realized how much additional software is required to make the buy/sell decisions on stocks and keep track of everything. Although there are some of us capable of doing this on our own, it requires a significant amount of effort, and the result is not likely to be as capable as what a full-time P123 team can come up with.

Will we be able to target stock alpha, volatility, or alternative metrics using these new tools? Currently, we’re limited to relying solely on total rate of return. We require more consistent metrics for our analysis, imo.


I cannot get ML to work unless I use excess returns relative to the universe. Full stop. I assume this filters out the random market noise. And de Prado often talks about detoning and denoising. A whole chapter is dedicated to these two terms in this reference: Machine Learning for Asset Managers. Although he makes it harder than it has to be—often using matrices in his discussions and proofs.

Maybe others have a different experience. Maybe this will be included in the P123 AI/ML beta. For the present downloads one needs to use Python to get this target—which is not hard. Maybe we do not need it as a target in the downloads, but only because we have the ability to manipulate the data once it is downloaded.
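For anyone curious what that Python step looks like, here is one way to build the excess-return target from a downloaded long-format table, by subtracting the equal-weighted universe mean return on each date. The column names are hypothetical, not the actual download format:

```python
import pandas as pd

# Hypothetical long-format download: one row per (date, ticker),
# with fut_ret = the stock's forward return over the horizon
df = pd.DataFrame({
    "date":    ["2024-01-05"] * 3 + ["2024-01-12"] * 3,
    "ticker":  ["AAA", "BBB", "CCC"] * 2,
    "fut_ret": [0.02, -0.01, 0.05, 0.00, 0.03, -0.03],
})

# Equal-weighted (no cap-weighting) universe mean per date,
# subtracted from each stock's return to give the excess return
df["excess_ret"] = df["fut_ret"] - df.groupby("date")["fut_ret"].transform("mean")
print(df["excess_ret"].round(4).tolist())
```

By construction the excess returns average to zero within each date, which removes the common market component each period, much like the detoning de Prado describes.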

FWIW, I assume you want feedback and forum participation at some point, and it is not too early for that now. This may already be a planned target in the beta release, as far as I know. Other people’s experience may differ from mine, and I am happy to learn about a better target suggested by P123 when targets are discussed in the forum. You might also consider testing it before the beta release if you have not already. There are still a lot of unknowns for members, so this is all guesswork about what the release will look like, and my apologies if my speculation is in error or simply too early. But I cannot get ML to work without this target myself, which seemed salient to the discussion.

Use-case: I might train a random forest model that has been shown to work with the downloads (using cross-validation), and then rebalance easily with the new AI/ML tools using the same hyperparameters. I would need the same target to be able to do that. And I am pretty sure I will not be the only one to find excess returns relative to the universe (without cap-weighting) useful as a target.
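That workflow, validating a random forest on the downloaded data before reusing the same hyperparameters elsewhere, can be sketched roughly as follows. The data here is synthetic; a real run would use factor columns and an excess-return target from a download:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(1)

# Synthetic stand-in for factor snapshots ordered in time
X = rng.normal(size=(600, 4))
# Synthetic stand-in for the excess-return target
y = X[:, 0] * 0.4 + rng.normal(scale=1.0, size=600)

# Walk-forward splits: each fold trains only on data that
# precedes its test window, avoiding look-ahead
cv = TimeSeriesSplit(n_splits=5)
model = RandomForestRegressor(n_estimators=100, max_depth=3, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(len(scores))
```

The hyperparameters that survive this validation (here `n_estimators` and `max_depth`) are the ones you would then want to carry over to the platform model, which is why using the same target in both places matters.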

There would be no question of other people wanting this if de Prado became a member, although he might find the matrix route more intuitive. He might also want to approach this from the standpoint of making the data stationary, which he seems to think is important. Either way, I think there are more complex arguments for using excess returns relative to the universe as a target than simply noticing that nothing else seems to work. While present members may not care, the Kaggle crowd and others with machine learning or statistical training are sure to notice, from a marketing perspective. :slightly_smiling_face:

Addendum: If I was too wonky, here is how de Prado puts it in the above reference (regarding detoning):

“…by removing the market component, we allow a greater portion of the correlation to be explained by components that affect specific subsets of the securities. It is similar to removing a loud tone that prevents us from hearing other sounds.”

de Prado, Marcos López. Machine Learning for Asset Managers (Elements in Quantitative Finance), p. 31. Cambridge University Press. Kindle Edition.


What do you mean, “currently we’re limited to returns”? Do you mean within the Factor Download tool? You can use any target you want: volatility, sales growth, estimates… Anything you can write as a formula can be a target. We didn’t include examples; maybe that’s the problem.

There are two ways to do any targets in the Factor Download tool (I’ll use 1Mo volatility as example):

1) Lag it yourself
Simply add a feature like TRSD30D. When you download the data, shift the non-target columns down 4 weeks so that your features for a particular date line up with the future TRSD30D 4 weeks out. (Note that TRSD30D uses 21 bars, so there’s a little bit of leakage, since 4 weeks is at most 20 bars.)

2) Use FHist with a negative offset
If you don’t want to shift columns, just add your target as FHist(“TRSD30D”,-4), which calculates the volatility 4 weeks into the future.
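The column shift in option 1 can be done in pandas after the download. A sketch, with hypothetical column names and a weekly row frequency per ticker:

```python
import pandas as pd

# Hypothetical weekly download for one ticker: trsd30d is the
# volatility column, feat is any non-target feature column
df = pd.DataFrame({
    "ticker":  ["AAA"] * 6,
    "trsd30d": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
    "feat":    [10, 20, 30, 40, 50, 60],
})

# Shift the feature down 4 weekly rows within each ticker, so the
# features observed at week t sit on the same row as the TRSD30D
# value observed at week t+4 (the future target)
df["feat_lagged"] = df.groupby("ticker")["feat"].shift(4)

# The first 4 rows per ticker have no lagged features to pair up
df = df.dropna()
print(df[["feat_lagged", "trsd30d"]].values.tolist())
```

Grouping by ticker before shifting matters when the download stacks multiple tickers, so one stock’s features never bleed into another’s rows.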

We’ll add more examples to the Targets section of the Factor Download that should help you get started. Let us know of any specific ones.


It is my understanding that the recently introduced download tool serves as a preliminary measure for the forthcoming AI tools. Am I mistaken in this interpretation? My inquiry pertains to the potential alignment between the launch of AI tools on this platform and their capacity to effectively target user objectives. While Jrinne commented on excess returns, I personally favor focusing on alpha. Additionally, I anticipate that some individuals may seek to standardize returns by factoring in standard deviation, specifically by evaluating alpha relative to the standard deviation of alpha. Will this be provided?
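The “alpha relative to the standard deviation of alpha” idea is essentially an information-ratio-style target. A tiny sketch of what that computation looks like, on a hypothetical series of periodic alphas (the numbers are synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical series of 36 monthly alphas for one stock,
# i.e. returns already adjusted for benchmark exposure
alpha = rng.normal(loc=0.002, scale=0.01, size=36)

# Mean alpha scaled by its own standard deviation: a risk-adjusted
# score that rewards consistency, not just raw magnitude
ir_like = alpha.mean() / alpha.std(ddof=1)
print(float(ir_like))
```

Whether the platform exposes such a metric as a built-in target is exactly the open question this post is asking; with formula-based targets it may be expressible directly, but that remains for P123 to confirm.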

@marco ?