Do AI Models Use Raw Absolute Values or Relative Ranks?

Our AI Factors use normalized features for training and inference, either as percentile ranks or z-scores. If you want to incorporate magnitude, use z-scores. We trim outliers at 2.5σ by default, but you can go as high as 10σ. Also, with the latest AI Factor upgrades, you can control normalization per feature (including skipping normalization altogether).
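To make the difference concrete, here is a minimal sketch of the two modes, assuming a pandas Series holding one feature's raw values across stocks on a single date. The function names and the `winsor_sigma` parameter are illustrative only, not Portfolio123's actual API:

```python
import pandas as pd

def zscore(values: pd.Series, winsor_sigma: float = 2.5) -> pd.Series:
    """Z-score normalization: preserves magnitude, trims outliers at +/- sigma."""
    z = (values - values.mean()) / values.std()
    return z.clip(lower=-winsor_sigma, upper=winsor_sigma)

def percentile_rank(values: pd.Series) -> pd.Series:
    """Percentile rank: keeps only the ordering, magnitude is discarded."""
    return values.rank(pct=True) * 100

# Ten P/E-like values with one extreme outlier.
pe = pd.Series([8, 9, 10, 11, 12, 13, 14, 15, 16, 480], dtype=float)
print(zscore(pe))           # the outlier's z-score is clipped to +2.5
print(percentile_rank(pe))  # the outlier is simply the 100th percentile
```

Note how the z-score still separates a P/E of 8 from a P/E of 16 by magnitude, while the percentile rank spaces them evenly.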

Our "classic" ranking systems are based on percentile ranks, with a choice of how to handle NAs. We're planning to revisit the classic ranking systems and formally add z-scores as a ranking option.
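For readers unfamiliar with the NA choice, here is a hedged sketch of what the options amount to. The "negative" and "neutral" labels and the function signature are illustrative, not the exact names used in the platform:

```python
import numpy as np
import pandas as pd

def rank_with_na(values: pd.Series, na_handling: str = "neutral") -> pd.Series:
    ranks = values.rank(pct=True) * 100   # NaNs stay NaN at this point
    if na_handling == "negative":
        return ranks.fillna(0.0)          # NAs get the worst possible rank
    if na_handling == "neutral":
        return ranks.fillna(50.0)         # NAs get a middle-of-the-pack rank
    return ranks                          # leave NAs unranked

roe = pd.Series([0.05, np.nan, 0.12, 0.30], index=["A", "B", "C", "D"])
print(rank_with_na(roe, "negative"))
print(rank_with_na(roe, "neutral"))
```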

Regardless of where they are used, each scaling method has pluses and minuses. We're working on tools to analyze them, but for now the choice is mostly made empirically.

Interesting idea. This is not currently possible with ranking systems. With AI Factors, you can retrain the predictor, which changes the weight of each feature. By the way, you can empirically find the ideal retraining period by running cross-validations with different numbers of folds.
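One way to read that suggestion: on a fixed history, more time-ordered folds means shorter training/test gaps, which roughly mimics a shorter retraining period. A minimal sketch of the sweep, with placeholder data, model, and scoring (none of this is the AI Factor internals):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))            # placeholder features, in time order
y = X[:, 0] * 0.5 + rng.normal(size=500)  # placeholder target

for n_folds in (3, 5, 10):
    cv = TimeSeriesSplit(n_splits=n_folds)  # chronological splits, no shuffling
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{n_folds} folds: mean out-of-sample R^2 = {scores.mean():.3f}")
```

Whichever fold count gives the most stable out-of-sample score hints at how often retraining pays off.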

With AI Factors you can normalize using the entire dataset. This should be used carefully, as it can introduce look-ahead bias, but for stationary factors like P/E that are range bound (they are, aren't they?) normalizing against the whole dataset might give you the desired effect.
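A small sketch of the difference, assuming a long-format DataFrame with one row per (date, ticker) observation; the column names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "date":   ["2020-01-31"] * 3 + ["2020-02-29"] * 3,
    "ticker": ["A", "B", "C"] * 2,
    "pe":     [10.0, 15.0, 20.0, 12.0, 18.0, 30.0],
})

# Per-date z-score: each date is normalized only against its own cross-section,
# so no future information leaks in.
df["pe_z_by_date"] = df.groupby("date")["pe"].transform(
    lambda s: (s - s.mean()) / s.std())

# Whole-dataset z-score: the January rows are scaled using February's
# statistics too, which is exactly the look-ahead the caveat above refers to.
df["pe_z_global"] = (df["pe"] - df["pe"].mean()) / df["pe"].std()
print(df)
```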

We're working on a tool to calculate alpha for individual factors en masse and give you, for example, the top 20 factors that are least correlated with each other.
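Since the tool isn't out yet, here is a rough sketch of what that kind of screen could look like: estimate a simple single-benchmark alpha for each factor's return series, then greedily keep the best-alpha factors whose pairwise correlation with anything already selected stays low. The data, the 0.3 threshold, and the helper names are all placeholders:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.02, size=260)  # placeholder weekly benchmark returns
factors = pd.DataFrame(rng.normal(0.0, 0.02, size=(260, 40)),
                       columns=[f"factor_{i}" for i in range(40)])

def simple_alpha(r: pd.Series, bench: np.ndarray) -> float:
    """OLS alpha of a return series vs. a single benchmark."""
    beta = np.cov(r, bench)[0, 1] / np.var(bench, ddof=1)
    return float(r.mean() - beta * bench.mean())

alphas = factors.apply(simple_alpha, bench=market).sort_values(ascending=False)
corr = factors.corr()

selected: list[str] = []
for name in alphas.index:                                    # best alpha first
    if all(abs(corr.loc[name, s]) < 0.3 for s in selected):  # low overlap only
        selected.append(name)
    if len(selected) == 20:
        break
print(selected)
```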