Big thanks to @bobmc for steering us in the right direction with his comments. We were going down the wrong path, trying to support backtests with a "point in time" predictor so that no training data is used for predicting. But that's completely unnecessary, since we do not need to do any inference during a backtest. The inference was already done during validation; it simply was not stored.
By storing the validation predictions we achieve several things:
Backtests will be super fast. We just pull in the stored validation predictions for each stock on the rebalance date (rough sketch below).
Ensemble models will be super easy within an AIFactor. The ensembles will show up in the Reports and be readily comparable to single models.
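Here's a rough sketch of the mechanics in pandas. The names are made up for illustration; this is not our actual implementation or API:

```python
import pandas as pd

# Illustrative sketch only (not the real P123 internals): cache the predictions
# produced during validation, keyed by date and ticker, then have the backtest
# read from that cache instead of running the model again.

def store_validation_predictions(preds: pd.DataFrame, path: str) -> None:
    """preds has columns: as_of_date, ticker, prediction."""
    preds.to_csv(path, index=False)

def rank_for_date(path: str, as_of_date: str) -> pd.Series:
    """Build a percentile rank purely from stored predictions -- no inference."""
    preds = pd.read_csv(path, parse_dates=["as_of_date"])
    snap = preds[preds["as_of_date"] == pd.Timestamp(as_of_date)]
    return snap.set_index("ticker")["prediction"].rank(pct=True)
```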
There will be some limitations:
Predictors will only be allowed for current data (the latest data or the most recent weekend). Anything else will generate an error. It's better to eliminate any confusion: predictors are for the future. Nothing more.
Backtests will have to line up exactly with the AIFactor validation. For example, if your AI Factor was validated using a 4-week frequency, you will not be able to run a weekly backtest. And the rebalance dates will have to match the validation dates exactly (sketched below).
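Roughly the kind of check we'd enforce; again, illustrative names only, not the real interface:

```python
import pandas as pd

# Illustrative alignment check (made-up names): the backtest rebalance dates
# must be a subset of the AI Factor validation dates, otherwise there are no
# stored predictions to pull for those dates.

def check_backtest_alignment(validation_dates, backtest_dates) -> None:
    missing = pd.DatetimeIndex(backtest_dates).difference(
        pd.DatetimeIndex(validation_dates)
    )
    if len(missing) > 0:
        raise ValueError(
            f"No stored validation predictions for {len(missing)} rebalance "
            f"date(s), e.g. {missing[0].date()}; change the backtest frequency "
            "or re-validate the AI Factor."
        )
```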
Let's say we're looking to predict 1-month future returns. We set the rebalance frequency in the AI factor dataset 'period' to weekly, because if we did monthly, only a fraction of the data would be available for training (rough counts at the end of this post).
Sounds like we'd then not be able to backtest on a monthly basis. Wouldn't this be particularly disadvantageous to non-weekly rebalancers, given we'd no longer be able to take advantage of training on the full weekly dataset?
Also, with the dates having to line up, does this mean that every time we're looking to run a new backtest in the future, we'd have to create a new AI factor entirely? E.g. today is the 20th July. If we want to run an updated backtest in 2 months (20th September), we'd have to create a new AI factor with the updated dataset period, and revalidate all of the models to the current date?
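For reference, the rough sample-count arithmetic behind my weekly-vs-monthly concern above; the 10-year history and 1,000-stock universe are just numbers I made up for illustration:

```python
# With a 1-month target, weekly sampling produces roughly 4x as many
# (overlapping) training rows as 4-week sampling.
years, stocks = 10, 1000
weekly_rows = years * 52 * stocks       # ~520,000 rows
four_week_rows = years * 13 * stocks    # ~130,000 rows
print(weekly_rows, four_week_rows)      # 520000 130000
```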
I was playing around with different fundamentals as targets, like increased earnings over the coming 3 months, and I'm getting quite interesting results. However, running more than 2 AI factors in one rank seems to be too much, and I assume the solution you have for backtesting will solve this.
What is the timeline to be able to use the stored validation data when backtesting ranks and simulations?
If your algorithm is simple enough, or you have few enough features, you can run more AI factors. But I agree that good (or minimal) model combination requires much faster simulations.
There's an issue with AI Factor not aligning with 1-year targets, which results in a mismatch. The stats page shows the results, but the performance section uses a "weekly" rebalance if that is your observation period, despite looking 1 year out. I understand this was done to account for the lack of sampling, but it isn't the right way to handle it: it's producing incompatible results.
Also, I suggest allowing 1-year performance screens to have rolling options (monthly, weekly, or quarterly) to build a portfolio that holds for a full year, to account for the noise (i.e. it takes a year to fully diversify your signal if you are targeting 1 year). I believe this would thread the needle for the issue above.
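To make the rolling idea concrete, here's a minimal sketch assuming monthly tranches that each hold for 12 months; the data and names are placeholders, not anything produced by P123:

```python
import numpy as np
import pandas as pd

# Minimal sketch of a rolling 1-year screen: start a new tranche each month,
# hold it for 12 months, and report the equal-weight average return of the
# tranches that are still live. The returns are random placeholders.

rng = np.random.default_rng(0)
months = pd.period_range("2015-01", "2024-12", freq="M")

# tranche_ret.loc[m, s] = return in month m of the tranche started in month s
tranche_ret = pd.DataFrame(
    rng.normal(0.005, 0.04, size=(len(months), len(months))),
    index=months, columns=months,
)

def rolling_return(month: pd.Period, hold: int = 12) -> float:
    """Average this month's return over all tranches inside their holding period."""
    live = [s for s in months if s <= month < s + hold]
    return float(tranche_ret.loc[month, live].mean())

portfolio = pd.Series({m: rolling_return(m) for m in months})
```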
Additionally, could the performance screener for long-short portfolios be beta or volatility neutral? For example, if the longs have twice the beta of the shorts, rebalancing would give the shorts twice the weight (or the longs half) so the net beta comes out to zero. This is needed for long-short solutions, which is what AI is producing (in my experience).
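A tiny numeric illustration of the sizing I mean, with made-up betas:

```python
# Beta-neutral sizing: scale one leg so that weight * beta matches on both sides.
beta_long, beta_short = 1.6, 0.8            # longs carry twice the beta of the shorts
w_long = 1.0
w_short = w_long * beta_long / beta_short   # shorts get double the nominal weight
net_beta = w_long * beta_long - w_short * beta_short
print(w_short, net_beta)                    # 2.0 0.0
```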
Finally, with the rolling screener backtest as the main simulation engine for the new AI feature at long holding periods, it would be helpful to include beta, alpha, and related statistics with each rolling portfolio. The other stats are helpful but not enough anymore.
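Something like the sketch below is what I have in mind per rolling portfolio; the data are placeholders and I'm assuming monthly returns:

```python
import numpy as np

def alpha_beta(port_ret, bench_ret, periods_per_year=12):
    """Regress portfolio returns on the benchmark: slope = beta,
    intercept (annualized) = alpha."""
    beta, intercept = np.polyfit(bench_ret, port_ret, 1)
    return intercept * periods_per_year, beta

# Quick check on synthetic data (true alpha ~2.4%/yr, beta ~1.2).
rng = np.random.default_rng(1)
bench = rng.normal(0.006, 0.04, 120)
port = 0.002 + 1.2 * bench + rng.normal(0.0, 0.02, 120)
print(alpha_beta(port, bench))
```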
Lots of things here, I know, but the AI factor still feels less transparent than other P123 tools, and I believe these changes would help in many ways with little time invested.