Data leakage?

Kurtis,

I think you are right. Maybe I am missing a fine point myself, but as you suggest, the two are pretty much the same thing.

The purpose of k-fold cross-validation is to find the best model using something pretty similar to a rank performance test. But for now you cannot make a screen or Sim out of that information.
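To make "something pretty similar to a rank performance test" concrete, here is a minimal sketch in plain NumPy. Everything in it is an assumption for illustration: the data is synthetic, the model is ordinary least squares standing in for whatever model is being compared, and the names (`kfold_rank_score`, `spearman`) are made up. It is not how the platform implements its validation. Each fold is held out once, and the score is the rank correlation between the predictions and the actual values on that held-out fold.

```python
import numpy as np

def rank(values):
    # Ordinal ranks (0 = smallest); ties are broken by position,
    # which is fine for continuous data like this.
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(a, b):
    # Spearman rank correlation is the Pearson correlation of the ranks.
    return float(np.corrcoef(rank(a), rank(b))[0, 1])

def kfold_rank_score(X, y, k=5, seed=0):
    # Shuffle the row indices once, then split them into k roughly equal folds.
    # Assumption: rows are treated as independent samples.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Stand-in model (assumption): least squares with an intercept.
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.column_stack([np.ones(len(test)), X[test]]) @ coef
        # Score only on rows the model never saw during fitting.
        scores.append(spearman(pred, y[test]))
    return scores

# Synthetic data (assumption): 500 rows, 3 features, noisy linear target.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=1.0, size=500)

scores = kfold_rank_score(X, y, k=5)
print(np.round(scores, 3), "mean:", round(float(np.mean(scores)), 3))
```

Running the same function for several candidate models and comparing the mean score is the "find the best model" step.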

For now, making a Sim or screen means making an AI factor, which requires "Predict".

But I think you are 100% right. This last approach is just another way to do cross-validation, and the only differences are practical, especially if you want to make a Sim out of the data.

Pitmaster has made the same point in a different way (as have I). The k-fold validation data could be made into a screen. In a post, Pitmaster asked for a set number of stocks, which would act like a screen. I think you could also use that data for a Sim.
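As a rough illustration of how held-out validation predictions could act like a screen, here is a small sketch. It assumes you already have one held-out prediction per stock per date; the function name `top_n_screen`, the tickers, dates, and numbers are all hypothetical, and this is not a feature the platform offers today. For each date it simply keeps the N stocks with the highest prediction.

```python
from collections import defaultdict

def top_n_screen(rows, n):
    """rows: iterable of (date, ticker, held_out_prediction).

    Returns {date: [tickers]} with the n highest-predicted tickers per date.
    """
    by_date = defaultdict(list)
    for date, ticker, pred in rows:
        by_date[date].append((pred, ticker))
    screen = {}
    for date, items in by_date.items():
        # Highest prediction first; ticker breaks ties deterministically.
        items.sort(key=lambda item: (-item[0], item[1]))
        screen[date] = [ticker for _, ticker in items[:n]]
    return screen

# Hypothetical held-out predictions (made-up tickers and values).
rows = [
    ("2024-01-06", "AAA", 0.8),
    ("2024-01-06", "BBB", 0.1),
    ("2024-01-06", "CCC", 0.5),
    ("2024-01-13", "AAA", -0.2),
    ("2024-01-13", "BBB", 0.9),
    ("2024-01-13", "CCC", 0.3),
]

screen = top_n_screen(rows, n=2)
print(screen)
```

Holding those picks from one date to the next is, loosely, what a Sim built on the same data would do.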

So I think you are 100% correct in your impression. They serve the same purpose. But for now you cannot make a screen or Sim out of the k-fold validation method, and that is just a practical difference.

Here is an example of making a Sim using Predict: First impressions with new Beta - #6

Hope that helps some.

Jim
