Add Support for LightGBM’s XENDCG Objective (Better Ranking)

Hey everyone,

Wanted to throw out a feature request that I think could really level up the AI Factors platform — specifically, adding support for LightGBM’s rank_xendcg objective.

Here’s the quick rundown:

Why Bother?

Most of us are used to optimizing stock return models using RMSE or similar regression losses. But there’s some pretty convincing research showing that ranking objectives actually perform better when you’re building systematic stock selection strategies.

Poh et al. (2020) showed that using LambdaMART (an implementation of LambdaRank) can improve out-of-sample Sharpe ratios by 3x compared to standard RMSE-based models. Pretty big jump. (Check out Exhibit 2 in their paper.)
Bruch (2021) took it a step further, introducing XENDCG (Expected NDCG), a listwise ranking loss that not only performs better but also trains faster and more stably than LambdaRank.

Given that cross-sectional ranking (not point predictions) is the bread and butter of a lot of strategies here, I think this could benefit everyone.


What’s Blocking It Right Now?

When I try to set:

"objective": "rank_xendcg"

LightGBM throws this error:

LightGBMError: Ranking tasks require query information

Looks like P123 isn’t passing the group/query info that LightGBM needs for ranking objectives to work.


The Ask:

Could we get:

Support for rank_xendcg in LightGBM
Proper handling of the query/group data that LightGBM needs for ranking tasks

Why It’s Worth It:
It’s a cleaner fit for cross-sectional long-short models.
Faster, more stable, better ranking — backed by solid research.
Opens up a lot of cool opportunities for building systematic strategies focused on rank rather than raw returns.

Would love to hear thoughts from others — has anyone else tried using ranking objectives elsewhere? Curious if anyone’s worked around this limitation or if the dev team has thoughts on feasibility.

References:
Poh, Daniel, et al. Building cross-sectional systematic strategies by learning to rank. arXiv preprint arXiv:2012.07149 (2020).
Bruch, Sebastian. An alternative cross entropy loss for learning-to-rank. Proceedings of the Web Conference 2021.

Thanks all!
Henry

9 Likes

Hi, I've just rejoined P123 after many (10?) years away. What brought me here was the recent introduction of the AI functionality. I'm just getting started with P123, but I have had some success prior to joining P123 using XGBoost on just price series data (momentum and volatility factors) with the ranking objective rank:ndcg, which approximates LambdaMART. I think the LGBM method is similar, so I would like to see this added to either XGBoost or LGBM.
Thanks!

3 Likes

AI (ChatGPT) supports this feature request with a +1 :+1:, see redacted quote below:

Below are peer-reviewed papers, theses, and vendor documentation that directly back the claim that learning-to-rank (LtR) objectives such as lambdarank / rank_xendcg outperform ordinary regression losses for cross-sectional stock picking.

References (chronological), with the key finding relevant to cross-sectional equity ranking:

1. Poh, D., Lim, B., Zohren, S., & Roberts, S. (2021). Building cross-sectional systematic strategies by learning to rank. Journal of Financial Data Science, 3(2), 70–86. (doi.org) Key finding: using LambdaMART-style list-wise LtR models on daily cross-sectional momentum signals "boosts Sharpe Ratios by ~3× relative to sorting regression or classification outputs."
2. Poh, D. W. C. (2023). Constructing cross-sectional trading strategies: A machine-learning approach to learning to rank (Doctoral dissertation, University of Oxford). (robots.ox.ac.uk) Key finding: across 20 years of global equities, list-wise LtR models "consistently rank the top decile more accurately than gradient-boosted regression, leading to materially higher realised returns."
3. Kouloumpris, E., Moutsianas, K., & Vlahavas, I. (2024). SABER: Stochastic-aware bootstrap ensemble ranking for portfolio management. Expert Systems with Applications, 249, 123637. (doi.org) Key finding: presents an uncertainty-aware LtR ensemble that outperforms linear-regression and standard LtR baselines on ranking accuracy, delivers higher long-short returns, and lowers downside risk.
4. Quantitativo. (2025, March 1). Learning to rank enhances cross-sectional strategies. Quant Trading Rules. (quantitativo.com) Key finding: independent replication of (1); LambdaMART "tripled the strategy Sharpe vs. classic cross-sectional momentum built by regression."
5. LightGBM Developers. (2025). lightgbm.LGBMRanker (Version 4.6) [Computer software documentation]. (lightgbm.readthedocs.io) Key finding: states that ranking objectives are "mainly for training and applying ranking models," distinguishing them from LGBMRegressor; the library ships dedicated list-wise losses (lambdarank, rank_xendcg) specifically for ordered-list problems like stock selection.

Take-away :rocket:

These sources demonstrate—both theoretically and in large-scale empirical tests—that pair-wise / list-wise LtR losses align directly with the goal of ordering stocks each day, and empirically deliver higher top-bucket alpha and Sharpe than training the same booster with a plain regression objective.

That is why, once Portfolio123 exposes a “Query/Group” field, switching the JSON "objective" to "lambdarank" (or "rank_xendcg") is expected to outperform any regression-type stock-picking tasks.

2 Likes

Thanks for this thread.

Just when I think I’m getting to know ML I rediscover what a small fraction I actually know!

As one YouTube presenter put it: "The ML Problem You've Probably Never Heard Of... regression and classification... but have you heard of this? Learning to Rank."

I especially didn't think this could apply to predicting the best stock to purchase, but I was probably wrong.

It took me a while to get my head around what they are actually doing, and I'm still coming to grips with the concept. However, it does seem to solve one of the flaws I have observed with regression and tree solutions: they do a very good generalized prediction of typical stock performance, but at least for me they don't pick up any of the out-of-the-ordinary situations which are infrequent but profitable.

Although this technique isn't an obvious method to use for stock selection, there are enough comments and papers on the better results for stocks that it looks worth pursuing.

One thesis comes from, believe it or not, Khalifa University of the United Arab Emirates. It is based on technical price indicators, completely different from the primarily fundamental approach we are using. But the thesis has a nice introductory overview of the general approach, flow diagrams, and even some example code.


1 Like

Further evidence is out.

Guessing rank > guessing returns in this example.

3 Likes

Previous discussions in this thread focused on gradient boosting methods for rank ordering. This paper focuses on Ordered Logit Regression. This may or may not be a great fit with P123’s historical use of ranks and ordering. I have not tried it. But it has the advantage of transparency and lower computational costs compared to many ML methods.

In fact, Gemini thinks the computational requirements are such that… Well, here is the entire quote:

"Cost: As we discussed, you can run this on a standard laptop in seconds, whereas training a Listwise Neural Network requires a GPU and serious infrastructure."

2 Likes

To add a bit here: most papers rank into quintiles, deciles, or 1/25th portions. Realistically, it is unlikely we would trade that number of stocks in a single strategy. I have found LGBM regressors trained on z-scores to be better in almost every case compared to XENDCG (based on local reproduction).
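For anyone wanting to reproduce the regression-on-z-scores setup being compared here, the usual trick is to z-score the target cross-sectionally per date before training the regressor. A minimal sketch with toy column names (`date`, `ret`) as assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "date": np.repeat(["2024-01-31", "2024-02-29"], 100),
    "ret": rng.normal(size=200),
})

# Cross-sectional z-score: each date's returns get mean 0, std 1, so the
# regressor learns relative (rank-like) targets rather than raw returns.
df["z"] = df.groupby("date")["ret"].transform(lambda r: (r - r.mean()) / r.std())

# df["z"] would then be the label passed to the LGBM regressor.
```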

3 Likes

Have you tried ordered logit regression?

It is true that it uses something like deciles for training targets.

It still ultimately orders the stocks just like P123’s ranking system. You could use sell rules like RankPos and the coefficients can be put into a ranking system.

Almost as if it were designed for P123.

Addendum on RankPos and ordering.

I have used binary classification models, specifically Random Forest classification models, on P123 data. The classes can be positive return and negative return. Rank is obtained through the probabilities. The models do quite well, although not as well as regression models in my experience.

For ordered logit regression the ordering of every stock comes from the probability of being in a bin. So you would pick the 15 stocks with the highest probability of being in the top bin. This is slightly simplified.

The authors discuss using binary classification probabilities and prefer it over regression in the introduction. So they do have a preference for probability methods. But they also believe ordered logit regression is an improvement over binary classification (and return-forecast models as well) which their results suggest.

1 Like

I thought I might give it a go and see if I could learn something.

First lesson: this was actually fairly computationally intensive. I tried a walk-forward backtest, and after a few minutes it had not finished training on the first period, so I shut it off. Not sure if it would have finished if I had given it more time, but it is fairly computationally intensive at best.

With subsampling of the data, it was not a problem to get a rank ordering on all of the stocks, as discussed above.

Thank you for sharing this!

Interesting. Maybe I could try running it in the cloud. I have some cloud compute I can use.

1 Like

Over the past few months, I have benchmarked a variety of loss functions, and MSE remains the most stable baseline for the majority of use cases. Quantile regression is yielding compelling results, so I plan to further investigate its utility in capturing specific return thresholds. Interestingly, the Tweedie distribution (bridging Gamma and Poisson) is performing remarkably well on micro-caps. Given their discrete price movements and high zero-inflation, this alignment makes intuitive sense, and I’ll be diving deeper into the parameterization there.

I’ve tried the following objectives:

Objective | Type
MSE (L2) | Regression
MAE (L1) | Regression
Huber | Regression
Quantile | Regression
Tweedie | Regression
Poisson | Regression
Gamma | Regression
MAPE | Regression
LambdaRank | Ranking
Binary | Classification
Multiclass | Classification
Focal Loss | Custom
Asymmetric | Custom
Spread Loss | Custom

However, the most significant driver of right-tail alpha I’ve identified so far isn't a loss function at all—it's Top-Heavy Sampling. If you haven't integrated this into your training pipeline yet, do it and thank me later. :wink:
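"Top-Heavy Sampling" isn't a standard library term, so here is just one plausible reading of it: up-weight (or oversample) the rows from the top of each date's cross-sectional return distribution, so the model spends its capacity on the right tail it will actually trade. A sketch under that assumption, with arbitrary column names and weights:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "date": np.repeat(["2024-01-31", "2024-02-29"], 200),
    "ret": rng.normal(size=400),
})

# Percentile rank of each stock's return within its own date.
pct = df.groupby("date")["ret"].rank(pct=True)

# 5x training weight on the top quintile; 5.0 and 0.8 are arbitrary choices.
df["weight"] = np.where(pct > 0.8, 5.0, 1.0)

# The weights would then go into training, e.g. lgb.Dataset(..., weight=df["weight"]).
```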

7 Likes

Have you tried goss?

For data sampling, yes. I use it in a few live strategies.

1 Like

Thanks for sharing. I was thinking of trying this out once I get some free time. I went down the agentic AI rabbit hole.

I've got a computer sitting on my desk ready to install a Claw, but I know that once I start going down that rabbit hole I can write off 4 weeks of spare time :roll_eyes: I don't really have a good enough use case to put time into it yet.

1 Like

Yeah, at least 4 weeks :joy:

OpenRouter was the most useful site I found for it

I should mention that Huber (a combination of MSE and MAE) gives me very stable OOS results as well. Example below.

Tested as an imported stock factor, so I can't use "Force Positions into Universe", which inflates turnover.

2 Likes

Very impressive. Would you be open to sharing the same screenshot with vs. without Huber, just so we know its contribution?

Also good

I've spent hundreds of hours testing different objectives. My conclusion is that regression objectives are the most stable ones: MSE, MAE, Huber, Poisson, Tweedie, MAPE... all good.
Classification objectives generate great top-bucket ranking results and somewhat erratic OOS results, but they are worth spending more time on.
With ranking objectives I get such bad results that I suspect I'm doing something wrong. I spent so many hours tweaking (different normalization methods, all methods of truncation, etc.) and gave up.

I suspect that I will be able to get the best results by combining a regression and a classification model into one strategy. The possibilities are endless...

4 Likes