Based on recent forum discussions (linked below), it seems important that P123 allow Feature Z-Score Normalization by Date and not force the user to Z-Score Normalize over the entire dataset. The reasons for this are as follows:
1.) Normalization over the entire dataset introduces some data leakage into the training set. For example, an earnings growth rate of 50% in 2003 might be the largest ever seen up to that date, but the algorithm would give it a lower score because the 200% earnings growth seen in 2009 influences the mean and standard deviation used for normalization.
2.) Discrepancies in normalization between Training, Test, and Live Data. Are the same mean and standard deviation being used for normalization in all three cases? If my model is fit based on data normalized with a particular mean and standard deviation, what happens to its predictions when that mean and standard deviation change?
3.) Disconnect between normalizing the Target Variable by Date, while normalizing the Features over the entire Dataset.
4.) Inability to download the actual normalized feature set used to fit the AI Factor model. This limits the user's ability to test and understand their model before putting it into a Live Strategy.
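To make point 1 concrete, here is a minimal sketch in pandas comparing a z-score taken over the whole dataset with a z-score taken by date. The tickers, dates, and growth numbers are made up for illustration, and this is not P123's actual normalization code (which may also trim or clip outliers).

```python
import pandas as pd

# Hypothetical toy panel: earnings growth (%) for four stocks on two dates.
df = pd.DataFrame({
    "date": ["2003-01-04"] * 4 + ["2009-01-03"] * 4,
    "ticker": ["A", "B", "C", "D"] * 2,
    "eps_growth": [50.0, 10.0, 5.0, -5.0, 200.0, 150.0, 100.0, 50.0],
})

def zscore(s: pd.Series) -> pd.Series:
    # Plain z-score using the population standard deviation (ddof=0).
    return (s - s.mean()) / s.std(ddof=0)

# One mean and standard deviation for the entire dataset.
df["z_full"] = zscore(df["eps_growth"])

# A separate mean and standard deviation for each date.
df["z_by_date"] = df.groupby("date")["eps_growth"].transform(zscore)

print(df)
```

In this toy data, the 50% growth in 2003 is the best value on its date and gets a clearly positive z_by_date, but its z_full is negative because the 2009 values pull the overall mean up. The same 50% in 2009 is the worst value on its date, yet it gets exactly the same z_full as the 2003 row.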
Some time ago I ran a few tests to see whether normalizing over the entire data set affected the "out of sample" results when the out-of-sample period was included in the normalization period too.
The first test uses a data set covering the whole period, 2003-2025 (so the normalization also sees the out-of-sample years), and then trains a predictor on 2003-2020.
The next run uses a data set covering only 2003-2020, and then trains the predictor on 2003-2020.
Now compare the out-of-sample results for 2021-2025 between the two. I have not seen any significant differences in the few tests I have performed.
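One possible reason for seeing little difference: a z-score over a fixed period is just a shift and rescale of each feature, and some learners do not care about that. The sketch below is a rough NumPy illustration on synthetic data (not P123 data, and not the test described above). It fits ordinary least squares with an intercept on features normalized with whole-period statistics and again with training-period statistics, then compares the out-of-sample predictions. The row split, coefficients, and helper names are all assumptions made for the example. It does not cover by-date normalization, outlier trimming, or learners such as regularized regressions and neural nets, where the choice of scaling can matter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: the first 800 rows stand in for 2003-2020,
# the last 200 rows for 2021-2025.
n_train, n_oos = 800, 200
X = rng.normal(loc=5.0, scale=2.0, size=(n_train + n_oos, 3))
X[n_train:] += 1.5  # shift the later period so the two sets of statistics differ
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=n_train + n_oos)

def zscore_with(values, reference):
    # Normalize `values` using the mean and std computed from `reference`.
    return (values - reference.mean(axis=0)) / reference.std(axis=0)

def fit_predict(X_train, y_train, X_new):
    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

Z_full = zscore_with(X, X)             # statistics from the whole period
Z_train = zscore_with(X, X[:n_train])  # statistics from the training period only

pred_full = fit_predict(Z_full[:n_train], y[:n_train], Z_full[n_train:])
pred_train = fit_predict(Z_train[:n_train], y[:n_train], Z_train[n_train:])

max_diff = np.max(np.abs(pred_full - pred_train))
print("largest difference in out-of-sample predictions:", max_diff)
```

The normalized features themselves differ between the two runs, but the fitted model absorbs the shift and scale into its coefficients, so the out-of-sample predictions agree up to floating-point error. Normalizing by date is a different transformation for every date, so this argument does not carry over to it.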
Going to close this request, because I'm an idiot. Looks like you can set normalization by Date per Feature in the Step2 Normalization drop-down menu. Thanks to P123 staff for pointing this out.
I wonder if P123 could share a solution with the rest of the forum (in addition to any direct messaging) next time a member requests a feature that is already available. If someone as sophisticated as Daniel has a question about something like this in the future, I am sure he is not the only one who would benefit.