Ranking vs machine-learning algorithms

By the way, this is what I understood from the Substack post from Andreas:

Architecture
• Large & Mid-Cap Stocks: ZScore-based only: (1) ZScore + Date (Global) with Skip and Date at the feature level; (2) ZScore + Dataset (Global) with Skip and Dataset at the feature level (considered best for long-term robustness). Rationale: builds relative, dynamic rules that adapt to rapidly changing market structures; "Rank + Date" is too static and loses its robustness in these efficient markets.
• Small-Cap Stocks: A combination of both: (1) ZScore + Date (Global) with Skip and Date at the feature level; (2) Rank + Date (Global) with Skip and Date at the feature level. Rationale: the two systems are complementary; they select different stocks, increasing the portfolio's capacity and diversification. "Rank + Date" works here because inefficiencies are more stable and can be captured with absolute rules.

Target Variable
• Large & Mid-Cap Stocks: 9- to 12-month total return / relative return. (Note: 6 months might be fine for mid-caps, but was not tested by the author.) Rationale: longer horizons smooth out short-term noise and align with how institutional capital rotates in more efficient markets.
• Small-Cap Stocks: 3-month total return / relative return. Rationale: signals decay faster in noisy small-cap markets; shorter horizons are ideal for capturing strong, short-term mispricings.

ML Algorithm
• Large & Mid-Cap Stocks: ExtraTrees. Rationale: stable, low-variance, and performs well with the "cleaner" data typical of large-cap stocks.
• Small-Cap Stocks: LightGBM + ExtraTrees (used to provide complementary signals). Rationale: they serve different purposes; LightGBM is an "alpha extractor" (best for concentrated portfolios), while ExtraTrees is a "rank stabilizer" (best for broader portfolios of 30-100+ stocks). A rough code sketch of these two models follows the table.

Number of Features
• Large & Mid-Cap Stocks: sweet spot of 87–180. Rationale: a carefully selected set of features is crucial; the author emphasizes building on top of well-tested factors rather than reinventing the wheel.
• Small-Cap Stocks: sweet spot of 87–180. Rationale: a broad, curated set of features is necessary to capture the various drivers in the more complex and noisy small-cap universe.

Outlier Limit
• Large & Mid-Cap Stocks: always 5. Rationale: the author emphasizes that the default value of 2.5 is too low and will remove too much valuable information.
• Small-Cap Stocks: always 5 (for the ZScore architecture). Rationale: same as for large caps; the default value of 2.5 is too low.

Retraining
• Large & Mid-Cap Stocks: less sensitive to retraining. Rationale: the ZScore approach is relative and has a "longer shelf life" across market regimes.
• Small-Cap Stocks: "Rank + Date" requires regular retraining (every 12–18 months). Rationale: the absolute, static rules in "Rank + Date" must be refreshed to remain relevant; the ZScore system is more robust over time.
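To make the ML Algorithm row concrete, here is a rough sketch of what fitting those two tree models on a tabular feature file could look like. This is my own illustration, not code from the post; the file name, column names, target, and parameters are made up.

```python
# Illustration only: ExtraTrees and LightGBM on a tabular feature set with a
# forward-return target (the real P123 AI Factor pipeline handles normalization,
# dates, and targets internally).
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor
from lightgbm import LGBMRegressor

df = pd.read_csv("features.csv")                         # hypothetical file: one row per stock per date
feature_cols = [c for c in df.columns if c.startswith("f_")]   # ~87-180 feature columns (hypothetical naming)
X, y = df[feature_cols], df["fwd_3m_rel_return"]         # hypothetical forward-return target column

# ExtraTrees: the stable, low-variance "rank stabilizer"
et = ExtraTreesRegressor(n_estimators=500, min_samples_leaf=50, n_jobs=-1)
et.fit(X, y)

# LightGBM: the gradient-boosted "alpha extractor"
gbm = LGBMRegressor(n_estimators=500, learning_rate=0.05)
gbm.fit(X, y)

# Blend the two complementary signals by averaging their cross-sectional percentile ranks
preds = pd.DataFrame({"et": et.predict(X), "gbm": gbm.predict(X)})
combined_rank = preds.rank(pct=True).mean(axis=1)
```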

Yuval,

After thinking about this and posting some answers that were probably in the weeds, I have a short answer to your question.

For context, some of the things I used to do to weight features in P123 classic's ranks did not really need my domain knowledge, or any knowledge at all, really.

This included randomizing weights in a spreadsheet and putting those weights into the optimizer. I am not sure how people are doing that nowadays, but it used to be a popular method with P123.

Personally, I am happy to let Python do some or all of that. And sometimes, at least, I carry that to the point of turning P123 classic into a machine learning program. For me, P123 classic is machine learning, particularly when I use a Python program to optimize the weights.
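A rough sketch of what I mean, with rank_performance standing in for the actual rank performance test (the function body below is just a dummy objective so the sketch runs; it is not how the test is computed, and all the numbers are arbitrary):

```python
# Random-search sketch: try random weight vectors and keep whichever scores best
# on the rank performance test (here a made-up stand-in function).
import numpy as np

def rank_performance(weights):
    # Stand-in for running the ranking system with these feature weights and
    # returning a backtest metric; this dummy quadratic just makes the sketch runnable.
    return -float(np.sum((weights - weights.mean()) ** 2))

n_features = 30
rng = np.random.default_rng(42)
best_score, best_weights = -np.inf, None

for _ in range(200):
    w = rng.random(n_features)
    w = w / w.sum() * 100            # normalize so the weights sum to 100%
    score = rank_performance(w)
    if score > best_score:
        best_score, best_weights = score, w
```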

For me, the line between P123 classic and machine learning became pretty blurry about 10 years ago, when InspectorSector uploaded a spreadsheet that would do some of the randomization of feature weights. His algorithm then copied the features and pasted them into the optimizer. The rank performance test then essentially became a selection function in an evolutionary algorithm. Much of this was done manually at the time; I guess it still is by some.

Maybe it was not yet machine learning at that point in time, given the need for manual operation and the need for the user to look at the rank weights that performed best and re-enter them into InspectorSector's spreadsheet. But I was happy to turn that over to Python and let it do the work while I slept.
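Handing that loop to Python looks roughly like the sketch below, with the rank performance test acting as the fitness/selection function. Again, rank_performance is the same made-up stand-in as above, and the population size, mutation scale, and generation count are arbitrary.

```python
# Evolutionary-loop sketch: the rank performance test is the selection function,
# and random mutation replaces the manual re-entry of the best weights.
import numpy as np

def rank_performance(weights):
    # Same hypothetical stand-in as in the previous sketch
    return -float(np.sum((weights - weights.mean()) ** 2))

rng = np.random.default_rng(0)
n_features, pop_size, n_generations = 30, 20, 50

# Random starting population of weight vectors, each summing to 100%
population = rng.random((pop_size, n_features))
population = population / population.sum(axis=1, keepdims=True) * 100

for _ in range(n_generations):
    scores = np.array([rank_performance(w) for w in population])   # fitness
    parents = population[np.argsort(scores)[-pop_size // 2:]]       # keep the best half
    children = parents + rng.normal(0.0, 2.0, parents.shape)        # mutate
    children = np.clip(children, 0.0, None)
    children = children / children.sum(axis=1, keepdims=True) * 100
    population = np.vstack([parents, children])

best_weights = population[np.argmax([rank_performance(w) for w in population])]
```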

When I do it that way, it is machine learning by any definition.

But I am still using P123 classic in the process. And I would never criticize P123 classic, or anyone doing some of the same things by hand that I do with Python.

@Pitmaster further automated something similar to InspectorSector's method here: Genetic algorithm to replace manual optimizaton in P123 classic - #41 by pitmaster. The original post in that thread was about a genetic algorithm that also fully automates the selection of weights in the ranking system.

Marco has proposed an entirely different way to automate the weighting of features in a ranking system here: AI Factor - Designer Model? - #15 by marco

This is now becoming commonplace and common knowledge among many members, including Marco: P123 can be made into a machine learning model. The origins of this extend back a decade or so, to when InspectorSector shared his spreadsheet.

So, to directly answer your invitation to give arguments for machine learning, I have three:

  • it works
  • it saves me time
  • I can’t see any degradation in performance compared to when I used a spreadsheet or manual trial and error to optimize a ranking system nearly a decade ago.

I think a lot of people are doing something like that and simply characterize it according to their affinity for the term machine learning. I don’t care what people call it, as long as I have the tools to optimize the rank weights using Python myself.

And that can be done in a nearly endless number of ways, with or without manual steps in the pipeline. In fact, I have a program doing that for me right now, using a method that is new to me but is part of a Python library.
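As one example of the kind of library routine I mean (not necessarily the one I am actually using), something like scipy's differential_evolution can run the whole search; rank_performance is again the hypothetical stand-in from the earlier sketches:

```python
# Example of letting an off-the-shelf library optimizer search the weight space
# (shown only to illustrate the idea, not as the specific method I use).
import numpy as np
from scipy.optimize import differential_evolution

def rank_performance(weights):
    # Same hypothetical stand-in for the rank performance test as above
    return -float(np.sum((weights - weights.mean()) ** 2))

def objective(raw):
    w = np.asarray(raw)
    w = w / w.sum() * 100                 # normalize to weights summing to 100%
    return -rank_performance(w)           # differential_evolution minimizes

bounds = [(0.01, 1.0)] * 30               # one (low, high) pair per feature weight
result = differential_evolution(objective, bounds, maxiter=50, seed=0)
best_weights = result.x / result.x.sum() * 100
```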


Thanks for your thoughts; they're well expressed, and I have no argument with them. For the purposes of this discussion I was curious about the advantages that the machine learning models P123 introduced about a year ago might have over ranking systems in terms of a) the way they choose stocks to buy and sell, and b) how well they can fit into my workflow. I wanted to present the advantages ranking systems had and was hoping to read some advantages that the ML models had as a counterpoint. It wasn't meant to be a discussion about the merits and demerits of machine learning in general, and I'm sorry if it turned into one.


Thanks, Yuval, and to be completely honest (this is my interpretation alone), I see many more similarities than differences in our methods.

Admittedly, I have a tendency to see the same math in different methods, and I think the math is pretty universal no matter the details of the methods.

I am such a math geek!

Thank you for getting me to think about this and finally articulate it in a somewhat understandable way, and for the great ideas you have presented in the forum over the years. I use a lot of your features in my models, for example.

And perhaps there is still a question to be asked: do we need tabular neural nets, or should we focus more on expanding P123 classic? I think that may be part of your question. If it is, I don’t have a clear answer to it.

Thank you Yuval.