AI Factor - Allow User to Enter Custom, Non-Contiguous Date Ranges For Validation & Prediction

dnevin123 · September 22, 2025, 7:45pm

Gents,

Wanted to suggest a feature for AI Factor. It would be very useful if we could enter Custom Non-Contiguous Date Ranges in both the Validation and Prediction portions of AI Factor. A couple of reasons for this:

1.) Divide Financial History by Market Environments

One could argue that the current methods are naive: they don’t allow the user to break up financial history by market environment. The easiest example of this is the GFC. For certain strategies (especially MicroCap-focused strategies), returns can be dominated by performance during the GFC. It would be nice to be able to break up the Validation and Prediction time periods with this in mind. Maybe I want to exclude the GFC entirely, or I want to make sure just one of my Validation folds contains the GFC time period, etc.
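
To make the idea concrete, here is a rough Python sketch of user-defined folds. It is not AI Factor's API: the `FOLDS` layout, the fold names, the `fold_for` helper, and the GFC dates (mid-2007 to mid-2009) are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical fold definitions: each fold is a list of (start, end) ranges,
# inclusive on both ends. A fold may hold several non-contiguous ranges.
FOLDS = {
    "pre_gfc": [(date(2003, 1, 1), date(2007, 6, 30))],
    "gfc": [(date(2007, 7, 1), date(2009, 6, 30))],  # assumed GFC window
    "post_gfc": [
        (date(2009, 7, 1), date(2014, 12, 31)),
        (date(2017, 1, 1), date(2019, 12, 31)),
    ],
}


def fold_for(d, folds=FOLDS):
    """Return the name of the fold containing date d, or None if d is excluded."""
    for name, ranges in folds.items():
        if any(start <= d <= end for start, end in ranges):
            return name
    return None
```

With this layout the GFC lives in exactly one fold, and any date that falls in no range (2015 and 2016 here) is simply left out of validation.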

2.) Create Ensemble Models

One thing that I worry about with the current system is that the Prediction model tends to be trained on the full time period, while the Validation models are trained on 5 different folds. I think you could make a case that taking the average rank from the 5 Validation models would be more robust than taking the single rank from the Prediction model trained over the whole time period (the influence of the GFC discussed above is one potential reason). If Custom Non-Contiguous Date Ranges were allowed in the Prediction section, I could generate a Predictor for each fold and then average them in a Ranking System.
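
A minimal sketch of the rank-averaging step, in plain Python. The tickers and scores are made up, `ranks` and `average_rank` are hypothetical helpers rather than anything in AI Factor, and the code assumes every fold predictor scores the same set of tickers and that rank 1 means the highest score.

```python
def ranks(scores):
    """Map ticker -> rank, where 1 is the highest score. Ties keep input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {ticker: i + 1 for i, ticker in enumerate(ordered)}


def average_rank(fold_scores):
    """Average each ticker's rank across the per-fold predictors."""
    per_fold = [ranks(s) for s in fold_scores]
    tickers = per_fold[0].keys()
    return {t: sum(r[t] for r in per_fold) / len(per_fold) for t in tickers}


# Made-up predictions from three fold-specific predictors.
fold_scores = [
    {"AAA": 0.9, "BBB": 0.5, "CCC": 0.1},
    {"AAA": 0.2, "BBB": 0.8, "CCC": 0.4},
    {"AAA": 0.7, "BBB": 0.6, "CCC": 0.3},
]
ensemble = average_rank(fold_scores)
```

Averaging ranks rather than raw scores keeps one predictor with a wider output scale from dominating the ensemble.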

Probably some other use cases for this feature, but I find these two to be compelling.

Cheers,

Daniel

marco · September 23, 2025, 2:49am

You can do this now. You have to enter the dates by hand rather than picking a fold, but it is possible. You can specify the training period for a predictor when you add it. The MAX represents the entire loaded dataset.

AlgoMan · September 23, 2025, 10:08am

I support that idea.

I have noticed that AI models often focus way too much on the return on the bounce after a crash. I guess it is natural that the models will do that since the biggest opportunities are in those periods. However, I would like to be able to try to train models blocking out those periods to see if it would give a less volatile return over the full cycles.

dnevin123 · September 23, 2025, 11:17am

Thanks Marco, I guess the only distinction would be the ability to enter non-contiguous date ranges similar to how the K-fold method works. One way to implement this would be to allow users to specify ranges of dates to exclude from the training set for a given Predictor.

-Daniel

marco · September 23, 2025, 12:16pm

I see. It's much easier to offer the ability to exclude periods in the predictor than in the validation. In the predictor it's just a mask. The validation runs backtests, so there's a lot more logic. Also, validation data can be used for portfolio backtests.
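
For illustration, a mask of this kind might look like the following in Python. This is only a sketch under assumptions, not how AI Factor implements it: the `EXCLUDE` list, its date windows, and the `training_mask` helper are all hypothetical.

```python
from datetime import date

# Hypothetical exclusion list: inclusive (start, end) ranges to drop from training.
EXCLUDE = [
    (date(2007, 7, 1), date(2009, 6, 30)),  # assumed GFC window
    (date(2020, 2, 1), date(2020, 6, 30)),  # assumed 2020 crash and bounce
]


def training_mask(dates, exclude=EXCLUDE):
    """Return one boolean per date: True keeps that row in the training set."""
    return [not any(start <= d <= end for start, end in exclude) for d in dates]


# Example: four sample dates, two of which fall inside excluded windows.
dates = [date(2006, 1, 6), date(2008, 10, 10), date(2015, 3, 6), date(2020, 3, 20)]
mask = training_mask(dates)
train_dates = [d for d, keep in zip(dates, mask) if keep]
```

The excluded rows are not deleted from the dataset, only skipped when fitting, so they remain available for a backtest over the skipped years.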

Thanks

SZ · September 23, 2025, 6:34pm

I had this idea this summer when I first heard of some of these models. Basically I wanted to not use 1999-2003 data in the training but have the ability to backtest a model using these skipped years. Would that be possible?