AI Factor - Recreating Linear Model Predictions

Of course P123 cannot train on (or even validate with) future data. As far as information leakage from validation data in the past is concerned, P123 is excellent about providing gaps for time-series validation and for k-fold validation, including an embargo period after the validation data.

Providing an embargo is a fine point that is often missed.
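To make the gap and embargo idea concrete, here is a minimal sketch in plain Python. It is not P123's implementation; the function name `purged_kfold_indices` and the `gap` and `embargo` parameters are my own hypothetical names, and it assumes the rows are already sorted by date.

```python
def purged_kfold_indices(n_samples, n_folds, gap, embargo):
    """Yield (train_idx, val_idx) for contiguous folds over time-ordered rows.

    gap     : rows dropped immediately BEFORE each validation block
    embargo : rows dropped immediately AFTER each validation block
    (Hypothetical helper for illustration; not a P123 function.)
    """
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder rows.
        stop = n_samples if k == n_folds - 1 else start + fold_size
        val_idx = list(range(start, stop))
        # Keep only rows outside the validation block, the gap, and the embargo.
        train_idx = [i for i in range(n_samples)
                     if i < start - gap or i >= stop + embargo]
        yield train_idx, val_idx


splits = list(purged_kfold_indices(n_samples=20, n_folds=4, gap=2, embargo=3))
for train_idx, val_idx in splits:
    print("validate rows", val_idx[0], "to", val_idx[-1],
          "| training rows kept:", len(train_idx))
```

The rows removed on either side of each validation block are what keep overlapping return windows in the training data from leaking into validation.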

Doing validation correctly is hard and P123 makes it easy. You almost can't do it wrong even if you want to.

P123 may normalize the training data and the validation data together, before proceeding with the training and testing steps. Marco considers the pros and cons of this here:

@AlgoMan had the same consideration, presumably for his downloads or API.

Both Marco and AlgoMan are considering the pros and cons of normalizing the entire data set at once (training and validation sets together) vs. normalizing just the training data and using the same mean and standard deviation to normalize the validation set.
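For anyone who wants to see the two options side by side, here is a rough sketch using only the Python standard library. It assumes a plain z-score on a single made-up factor; the data is synthetic and the variable names are mine, so this is not necessarily how P123 or AlgoMan normalize.

```python
import random
import statistics

random.seed(0)
# Synthetic factor values (an assumption for illustration only):
# earlier rows are training, later rows are validation.
train = [random.gauss(0.0, 1.0) for _ in range(500)]
val = [random.gauss(0.2, 1.1) for _ in range(100)]


def zscore(values, mean, std):
    """Standardize values with a supplied mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Option A: mean and std computed on training + validation together.
all_rows = train + val
mean_all = statistics.fmean(all_rows)
std_all = statistics.pstdev(all_rows)
val_a = zscore(val, mean_all, std_all)

# Option B: mean and std computed on training only, then reused on validation.
mean_tr = statistics.fmean(train)
std_tr = statistics.pstdev(train)
val_b = zscore(val, mean_tr, std_tr)

# Largest difference between the two versions of the validation data.
max_diff = max(abs(a - b) for a, b in zip(val_a, val_b))
print("max absolute difference in normalized validation values:", max_diff)
```

Under a z-score the two versions of each feature differ only by a shift and a rescale, and the same applies to the training data. So when the validation set is small relative to the training set and the distribution has not moved much, the statistics, and therefore the normalized values, stay close.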

It's both thorough and advanced on their parts to consider this. They both argue that it probably does not make any meaningful difference which way it is done.

Thank you for sharing your results @AlgoMan!