Marco,
I will leave it for @yuvaltaylor to comment, but I think he has consistently been for longer training periods. Note I don’t think using the optimizer (or not) is a big factor in whether we call this machine learning, reinforcement learning, or jrinne’s [or fill in the username of your choice] “special unique proprietary method not to be confused with machine learning.” Anyway, I leave it to Yuval to comment on his own system, and I apologize for any misunderstandings on my part.
In my case I can comment without equivocation. I tried two training schemes: a rolling 10-year window, and an expanding window that started at 2000 each time. I.e., train on 2000 - 2010, then test 2010 - 2011; then train on 2000 - 2011 and test 2011.
Which method won was NOT EVEN CLOSE: starting the training at year 2000 won hands down. That is what cross-validation is for: selecting the best method to use out of sample. For me, for this model, starting at 2000 was best no matter what anyone’s opinion on the matter is.
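For anyone who wants to try the same comparison, the two schemes above can be sketched roughly like this (an illustrative Python sketch of my own; the function names and exact year boundaries are mine, not anything from P123’s API, and year numbers stand in for yearly data slices):

```python
# Two walk-forward cross-validation schemes:
#   rolling_splits   -> train on a fixed-length trailing window, test next year
#   expanding_splits -> train on everything from an anchor year (e.g. 2000)

def rolling_splits(first_year, last_year, window=10):
    """Fixed-length rolling window: each test year trains on the prior `window` years."""
    splits = []
    for test_year in range(first_year + window, last_year + 1):
        train_years = list(range(test_year - window, test_year))
        splits.append((train_years, test_year))
    return splits

def expanding_splits(first_year, last_year, min_train=10):
    """Expanding window: each test year trains on every year since `first_year`."""
    splits = []
    for test_year in range(first_year + min_train, last_year + 1):
        train_years = list(range(first_year, test_year))
        splits.append((train_years, test_year))
    return splits

# The first split is identical under both schemes (train 2000-2009, test 2010).
# After that, the expanding scheme keeps the early years while the rolling
# scheme drops them, which is exactly the difference being compared.
```

You would then fit your model on each train slice, score it on the matching test year, and compare the two lists of out-of-sample results.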
Here are my results BTW: 2023 performance - #10 by Jrinne.
FWIW, that is one anecdotal example of a real, live port with P123 data, if we are to have just one way of doing cross-validation at P123.
Me, I would be happy to be able to rebalance daily with our present downloads without downloading data starting at 2000 each time: daily downloads for rank, z-score, and Min/Max. I would pay the API fee for each abbreviated download at rebalance, and I would download the more complete data (starting with 24 years of Min/Max) separately.
Ultimately, I would like to have the choice, and maybe not have it left up to someone else.
I DO appreciate what I can do now (see link above). This is just my feedback on the present downloads, which were described as a method of getting feedback from members who are already doing machine learning…
I welcome feedback from Pitmaster, Johnpaul, etc. My preference would be that they not even have to read my posts if they do not find them helpful (and could do it their own way).
Jim