Some time ago I ran a few tests to see whether normalizing over the entire data set affects the "out of sample" results when the out-of-sample period is included in the normalization period too.
The first test trains a model on the whole data set, 2003-2025, then trains a predictor on 2003-2020.
The next run trains only on the data set between 2003-2020, then trains the predictor on the same 2003-2020 data.
Now compare the out-of-sample results for 2021-2025 between the two. I have not seen any significant differences in the few tests I have performed.
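
For anyone who wants to try the same comparison, here is a rough sketch of the two runs. Everything in it is an assumption for illustration: the data is synthetic, row counts stand in for the 2003-2020 and 2021-2025 periods, z-scoring stands in for whatever normalization is actually used, and a small ridge regression stands in for the predictor. The function names (`zscore_params`, `fit_ridge`, `run_experiment`) are made up for this post.

```python
import numpy as np


def zscore_params(x):
    """Per-column mean and standard deviation (assumed normalization)."""
    return x.mean(axis=0), x.std(axis=0)


def fit_ridge(X, y, lam=10.0):
    """Ridge regression with an unpenalized intercept (stand-in predictor)."""
    A = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(A.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def run_experiment(seed=0, n_train=1800, n_test=500):
    """Return out-of-sample RMSE for both normalization choices."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test

    # Synthetic features with a hypothetical drift in the test period,
    # so the two sets of normalization statistics actually differ.
    X = rng.normal(size=(n, 3))
    X[n_train:] += 0.5
    y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)

    train = slice(0, n_train)   # stands in for 2003-2020
    test = slice(n_train, n)    # stands in for 2021-2025

    results = {}
    for name, fit_rows in (("full", slice(0, n)), ("train_only", train)):
        # Normalization statistics come from either all rows or train rows only.
        mu, sigma = zscore_params(X[fit_rows])
        Z = (X - mu) / sigma
        # The predictor is always trained on the training period only.
        coef = fit_ridge(Z[train], y[train])
        err = y[test] - predict(coef, Z[test])
        results[name] = float(np.sqrt(np.mean(err ** 2)))
    return results


if __name__ == "__main__":
    print(run_experiment())
```

One caveat on the design: with a plain unregularized linear model and an intercept, z-scoring with different statistics is just an affine change of the inputs and the predictions come out identical, so the sketch uses a ridge penalty to make the normalization choice able to matter at all. Other normalizations or predictors may behave differently.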