Would time-series validation be hard for P123?


So, of the 2 choices you proposed, Pitmaster and I like #2. You said #1 would be limited in its ability to cross-validate.

I do not think it would be hard for a member to do some cross-validation with option #1, maybe manually at first. It could also be automated for the models that run quickly.

Here is an example of manual time-series validation by a P123 member with option #1. Train with option #1 for 2000 - 2010. Run the model for 2010 - 2011 (a validation or test sample). Record the result for 2010 - 2011 in an Excel spreadsheet. Then train 2000 - 2011, test that model for the year 2011 - 2012, and record. Continue this way until you train 2000 - 2021, test 2021 - 2022, and record. Average the recorded spreadsheet results.
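For anyone who wants to see what automating those steps might look like, here is a rough Python sketch of the same expanding-window loop. To be clear, this is not P123's API: `expanding_window_splits`, `walk_forward_average`, and the `train_and_score` callback are names I made up for illustration. `train_and_score` stands in for whatever trains a model on the training years and returns one number for the test year.

```python
from statistics import mean
from typing import Callable, List, Tuple

YearRange = Tuple[int, int]  # (start_year, end_year)


def expanding_window_splits(
    first_year: int, first_test_year: int, last_test_year: int
) -> List[Tuple[YearRange, YearRange]]:
    """Build (train, test) year ranges where training always starts at
    first_year and grows by one year per step; each test window is the
    single year that follows the training window."""
    splits = []
    for test_start in range(first_test_year, last_test_year + 1):
        train = (first_year, test_start)
        test = (test_start, test_start + 1)
        splits.append((train, test))
    return splits


def walk_forward_average(
    splits: List[Tuple[YearRange, YearRange]],
    train_and_score: Callable[[YearRange, YearRange], float],
) -> Tuple[float, List[float]]:
    """Run the hypothetical train_and_score callback on every split,
    keep each result (the "spreadsheet" column), and average them."""
    results = [train_and_score(train, test) for train, test in splits]
    return mean(results), results


# Same windows as the manual example: train 2000 - 2010 and test 2010 - 2011,
# up through train 2000 - 2021 and test 2021 - 2022.
splits = expanding_window_splits(2000, 2010, 2021)
```

You would call `walk_forward_average(splits, my_scorer)` with your own scoring function. The point is only that the loop itself is a handful of lines; the runtime is dominated by whatever the model training costs.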

Arguably (okay, not even debatable really) this is more automated and uses a spreadsheet less than what many are doing now with the optimizer.

Arguably (okay, not so arguable again) it also takes less P123 processor time than what is being used now at P123 with the optimizer.

Plus, I suspect you could automate that. AND you could even automate a time-series validation mimicking what people do now with the optimizer.

But you might have to be selective about the runtime of some of the models. You would have to streamline the use of the optimizer, LIKE WITH THE BAYESIAN OPTIMIZATION THAT @jlittleton INTRODUCED TO THE FORUM. @pitmaster has some good ideas on machine learning models with very short runtimes. If you want to consider time-series validation, I suggest you talk to him about the initial models.

TL;DR: Are you making this harder than it has to be?