Another thing to keep in mind is that if you are doing k-fold cross-validation, redoing the preprocessing at each fold might be overkill. Using the entire dataset to calculate the distribution stats (the mean and standard deviation, for example) might be totally fine.
In fact, I’d even venture to say that it’s preferable to have as complete a picture of the volatility of a factor as possible, even if that means using the “forbidden” data reserved for validation.
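If you want to check how much this matters on your own data, here is a minimal sketch that compares the two options. It is only an illustration under my own assumptions: the data is synthetic (a single Gaussian factor), the folds are contiguous, and `kfold_indices` and `standardize` are hypothetical helpers written for this example, not functions from any library.

```python
import random
import statistics


def kfold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous, non-overlapping folds."""
    start = 0
    for i in range(k):
        # Spread any remainder over the first n % k folds.
        size = n // k + (1 if i < n % k else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def standardize(values, mean, std):
    """Center and scale values using the supplied mean and standard deviation."""
    return [(v - mean) / std for v in values]


random.seed(0)  # synthetic factor, purely illustrative
data = [random.gauss(10.0, 3.0) for _ in range(200)]

# Option A: one set of stats from the entire dataset (validation rows included).
global_mean = statistics.fmean(data)
global_std = statistics.pstdev(data)
scaled = standardize(data, global_mean, global_std)

# Option B: stats recomputed from each fold's training rows only.
max_mean_gap = 0.0
max_std_gap = 0.0
for train_idx, val_idx in kfold_indices(len(data), k=5):
    train = [data[i] for i in train_idx]
    fold_mean = statistics.fmean(train)
    fold_std = statistics.pstdev(train)
    max_mean_gap = max(max_mean_gap, abs(fold_mean - global_mean))
    max_std_gap = max(max_std_gap, abs(fold_std - global_std))

print(f"global mean={global_mean:.3f}, std={global_std:.3f}")
print(f"largest per-fold gap: mean={max_mean_gap:.3f}, std={max_std_gap:.3f}")
```

With a few hundred rows the per-fold stats tend to sit close to the global ones, which is the situation I have in mind above. With very few rows or a heavy-tailed factor the gap can be larger, so it is worth running the comparison before deciding.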