What do normalization and rank mean?

I have downloaded the factors with these settings:

What does it mean that the numbers for each factor for each date are ranked and normalized? I understand that they are given a value from 0 to 1:

But does this mean that "1" is the highest value a factor can receive? For example, if a ticker receives a "1" for "subindustry momentum," does that mean that on that date, the subindustry of this stock had one of the highest performances compared to other industries?

And what about "share turnover 3 months," where lower numbers are better? Is "0" the best value there, or is "1" still considered the best?

Here is a screenshot of the spreadsheet:

I think the data is normalized through ranking. From the documentation:

  • Rank: Values are sorted in descending order, then assigned the percentile rank from 1.0 to 0. The percentile step is calculated excluding NAs.

So yes, 1.0 is the top rank, and it always corresponds to the highest underlying factor value. This means that for lower-is-better factors like turnover, 0.0 is "best". I guess you could fix this by putting a minus sign in front of the factor, if you always want 1.0 to be optimal.
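Here is a minimal sketch of how I read that rule, in plain Python. It is not P123's actual code: the function name `percentile_rank`, the evenly spaced step of 1/(n-1), the tie handling, and the tickers and turnover numbers are all my own assumptions for illustration.

```python
def percentile_rank(values):
    """Rank values so the highest gets 1.0 and the lowest gets 0.0.

    Assumption: ranks are evenly spaced with step 1 / (n - 1).
    Ties simply keep their sorted order here; the real system may differ.
    """
    n = len(values)
    if n < 2:
        return [0.5] * n  # assumption: a lone value gets the midpoint
    # Indices sorted by value, highest first (descending order).
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    step = 1.0 / (n - 1)
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = 1.0 - position * step
    return ranks


# Hypothetical turnover numbers, where lower is better.
turnover = {"AAA": 0.10, "BBB": 0.40, "CCC": 0.25}
tickers = list(turnover)

# Ranking the raw factor: the highest turnover gets 1.0.
raw = percentile_rank([turnover[t] for t in tickers])

# Ranking the negated factor: the lowest turnover now gets 1.0.
flipped = percentile_rank([-turnover[t] for t in tickers])

for t, r, f in zip(tickers, raw, flipped):
    print(t, r, f)
```

With the raw factor, the highest-turnover ticker ends up at 1.0. After negating, the lowest-turnover ticker is at 1.0 instead, which is the "1.0 is always optimal" behavior.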

A small difference between Download factors and ordinary ranking systems is that NAs are always set to 0.5 here. In ranking systems they are instead set to the median of the non-NA values.
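To illustrate the Download factors behavior described above, here is a rough sketch in plain Python. Again this is not P123's code: `rank_with_na`, the `na_rank` parameter, the 1/(n-1) step, and the sample values are assumptions of mine. NAs are represented as `None`, left out when computing the step, and then given 0.5.

```python
def rank_with_na(values, na_rank=0.5):
    """Percentile-rank the non-NA values; give every NA the rank na_rank.

    Assumptions: None marks an NA, and the step between ranks is
    1 / (count of non-NA values - 1), so NAs do not affect the spacing.
    """
    present = [i for i, v in enumerate(values) if v is not None]
    ranks = [na_rank] * len(values)
    if len(present) < 2:
        return ranks  # assumption: nothing meaningful to rank
    step = 1.0 / (len(present) - 1)
    # Non-NA indices sorted by value, highest first.
    order = sorted(present, key=lambda i: values[i], reverse=True)
    for position, i in enumerate(order):
        ranks[i] = 1.0 - position * step
    return ranks


# Hypothetical factor values for five tickers, two of them NA.
values = [3.0, None, 1.0, 2.0, None]
ranks = rank_with_na(values)
print(ranks)
```

Note that in this example a real value can also land on 0.5, so a 0.5 in the download does not by itself tell you the underlying factor was NA.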

Edit: (sorta wishing P123 would add an option that would make download factors equal to the ranking system method)
