Great news: fast simulations and ensembles coming soon!

Dear All,

Big thanks to @bobmc for steering us in the right direction with his comments. We were going down the wrong path by trying to support backtests with a "point in time" predictor, so that no training data is used for predicting. But that is completely unnecessary, since we do not need to run any inference during a backtest. The inference was already done during validation; the results simply were not stored.

By storing the validation predictions we achieve several things:

  1. Backtests will be super fast. We just suck in the validation predictions for the stock on a particular date.
  2. Ensemble models will be super easy within an AIFactor. The ensembles will show up in the Reports and be readily comparable to single models.
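
To make the idea concrete, here is a minimal sketch of a backtest reading stored validation predictions instead of running inference. Everything in it is hypothetical: the store layout, the model and ticker names, and the equal-weight averaging used for the ensemble are assumptions for illustration, not the actual implementation.

```python
from statistics import mean

# Hypothetical store of validation predictions, keyed by
# (model name, validation date, ticker). All names and values are made up.
STORED = {
    ("model_a", "2024-01-06", "AAA"): 0.02,
    ("model_b", "2024-01-06", "AAA"): 0.04,
    ("model_a", "2024-01-06", "BBB"): -0.01,
    ("model_b", "2024-01-06", "BBB"): 0.03,
}


def stored_prediction(model, date, ticker, store=STORED):
    """Look up a prediction saved during validation; no inference is run."""
    return store[(model, date, ticker)]


def ensemble_prediction(models, date, ticker, store=STORED):
    """Equal-weight average of the stored predictions (an assumed rule)."""
    return mean(store[(m, date, ticker)] for m in models)


print(stored_prediction("model_a", "2024-01-06", "AAA"))
print(ensemble_prediction(["model_a", "model_b"], "2024-01-06", "BBB"))
```

In this sketch, a date that was never validated raises a `KeyError`, which mirrors the point that a backtest can only use dates for which predictions were stored.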

There will be some limitations:

  1. Predictors will only be allowed for current data (latest and most recent weekend). Anything else will generate an error. It's better to eliminate any confusion: predictors are for the future. Nothing more.
  2. Backtests will have to line up exactly with the AIFactor validation. For example, if your AI Factor was validated using a 4-week frequency, you will not be able to run a weekly backtest. And the dates will have to line up.
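
As a rough illustration of the second limitation, the sketch below checks whether every backtest date has a matching validation date. The function names and the example dates are hypothetical, and the real check may work differently.

```python
from datetime import date, timedelta


def schedule(start, periods, step_weeks):
    """Dates spaced step_weeks apart, beginning at start."""
    return [start + timedelta(weeks=step_weeks * i) for i in range(periods)]


def misaligned_dates(backtest_dates, validation_dates):
    """Backtest dates that have no stored validation prediction."""
    stored = set(validation_dates)
    return [d for d in backtest_dates if d not in stored]


start = date(2024, 1, 6)
validation = schedule(start, periods=4, step_weeks=4)  # validated every 4 weeks
weekly_backtest = schedule(start, periods=5, step_weeks=1)

# A weekly backtest asks for dates the 4-week validation never produced.
print(misaligned_dates(weekly_backtest, validation))
```

Here the weekly backtest fails the check because three of its first five dates fall between validation dates, while a backtest on the same 4-week schedule would return an empty list.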

Let us know your thoughts.


Just to clarify:

Let's say we're looking to predict 1 month future returns. We set the rebalance frequency in the AI factor dataset 'period' to weekly, because if we did monthly, only a small fraction of the data would be available for training.

Sounds like we'd then not be able to backtest on a monthly basis. Wouldn't this be particularly disadvantageous to non-weekly rebalancers, given we'd no longer be able to take advantage of training on the full weekly dataset?

Also, with the dates having to line up, does this mean that every time we're looking to run a new backtest in the future, we'd have to create a new AI factor entirely? E.g. today is the 20th July. If we want to run an updated backtest in 2 months (20th September), we'd have to create a new AI factor with the updated dataset period, and revalidate all of the models to the current date?

We should be able to make that work where the sim frequency is a multiple of, and greater than, the AI validation frequency.
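
A minimal sketch of how that could look, assuming frequencies are expressed in whole weeks and the stored validation dates are in order. The helper names are hypothetical, and the equal-frequency case is allowed here since it is just the exact line-up described earlier.

```python
from datetime import date, timedelta


def is_compatible(sim_weeks, validation_weeks):
    """True when the sim period is a whole multiple of the validation period."""
    return sim_weeks >= validation_weeks and sim_weeks % validation_weeks == 0


def sim_dates(stored_dates, sim_weeks, validation_weeks):
    """Pick every k-th stored validation date for a slower simulation."""
    if not is_compatible(sim_weeks, validation_weeks):
        raise ValueError("sim frequency must be a multiple of the validation frequency")
    step = sim_weeks // validation_weeks
    return stored_dates[::step]


# Eight weekly validation dates (made-up example data).
weekly = [date(2024, 1, 6) + timedelta(weeks=i) for i in range(8)]

# A 4-week simulation reuses every fourth weekly prediction date.
print(sim_dates(weekly, sim_weeks=4, validation_weeks=1))
```

With this approach a model validated weekly could still serve a 4-week simulation, whereas the reverse (weekly sim on a 4-week validation) is rejected.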

Yes, but only with the models you have zeroed in on, not "all of them".


I was playing around and using different fundamentals as targets, like increased earnings over the coming 3 months, and I'm getting quite interesting results. However, running more than 2 AI factors in one rank seems to be too much, and I assume the solution you have for backtesting will solve this.

What is the timeline to be able to use the stored validation data when backtesting ranks and simulations?

If your algorithm is simple enough, or you have few enough features, you can run more AI factors. But I agree that good (or minimal) model combination requires much faster simulations.

Next week, or week after at the latest. Some people are on vacation.
