Add Support for LightGBM’s XENDCG Objective (Better Ranking)

Hey everyone,

Wanted to throw out a feature request that I think could really level up the AI Factors platform — specifically, adding support for LightGBM’s rank_xendcg objective.

Here’s the quick rundown:

Why Bother?

Most of us are used to optimizing stock return models using RMSE or similar regression losses. But there’s some pretty convincing research showing that ranking objectives actually perform better when you’re building systematic stock selection strategies.

Poh et al. (2020) showed that using LambdaMART (an implementation of LambdaRank) can improve out-of-sample Sharpe ratios by 3x compared to standard RMSE-based models. Pretty big jump. (Check out Exhibit 2 in their paper.)
Bruch (2021) took it a step further, introducing XENDCG (Expected NDCG), a listwise ranking loss that not only performs better but also trains faster and more stably than LambdaRank.

Given that cross-sectional ranking (not point predictions) is the bread and butter of a lot of strategies here, I think this could benefit everyone.


What’s Blocking It Right Now?

When I try to set:

"objective": "rank_xendcg"

LightGBM throws this error:

LightGBMError: Ranking tasks require query information

Looks like P123 isn’t passing the group/query info that LightGBM needs for ranking objectives to work.


The Ask:

Could we get:

Support for rank_xendcg in LightGBM
Proper handling of the query/group data that LightGBM needs for ranking tasks

Why It’s Worth It:
It’s a cleaner fit for cross-sectional long-short models.
Faster, more stable, better ranking — backed by solid research.
Opens up a lot of cool opportunities for building systematic strategies focused on rank rather than raw returns.

Would love to hear thoughts from others — has anyone else tried using ranking objectives elsewhere? Curious if anyone’s worked around this limitation or if the dev team has thoughts on feasibility.

References:
Poh, Daniel, et al. Building cross-sectional systematic strategies by learning to rank. arXiv preprint arXiv:2012.07149 (2020).
Bruch, Sebastian. An alternative cross entropy loss for learning-to-rank. Proceedings of the Web Conference 2021.

Thanks all!
Henry

9 Likes

Hi, I've just rejoined P123 after many (10?) years away. What brought me here was the recent introduction of the AI functionality. I'm just getting started with P123, but I have had some success prior to joining P123 using XGBoost on just price series data (momentum and volatility factors) with the ranking objective rank:ndcg, which approximates LambdaMART. I think the LGBM method is similar, so I would like to see this added to either XGBoost or LGBM.
Thanks!

3 Likes

AI (ChatGPT) supports this feature request with a +1 :+1:, see redacted quote below:

Below are peer-reviewed papers, theses, and vendor documentation that directly back the claim that learning-to-rank (LtR) objectives such as lambdarank / rank_xendcg outperform ordinary regression losses for cross-sectional stock picking.

References (chronological), with the key finding relevant to cross-sectional equity ranking:

1. Poh, D., Lim, B., Zohren, S., & Roberts, S. (2021). Building cross-sectional systematic strategies by learning to rank. Journal of Financial Data Science, 3(2), 70–86. (doi.org) Key finding: using LambdaMART-style list-wise LtR models on daily cross-sectional momentum signals "boosts Sharpe Ratios by ~3× relative to sorting regression or classification outputs."
2. Poh, D. W. C. (2023). Constructing cross-sectional trading strategies: A machine-learning approach to learning to rank (Doctoral dissertation, University of Oxford). (robots.ox.ac.uk) Key finding: across 20 years of global equities, list-wise LtR models "consistently rank the top decile more accurately than gradient-boosted regression, leading to materially higher realised returns."
3. Kouloumpris, E., Moutsianas, K., & Vlahavas, I. (2024). SABER: Stochastic-aware bootstrap ensemble ranking for portfolio management. Expert Systems with Applications, 249, 123637. (doi.org) Key finding: presents an uncertainty-aware LtR ensemble that outperforms linear-regression and standard LtR baselines on ranking accuracy, delivers higher long-short returns, and lowers downside risk.
4. Quantitativo. (2025, March 1). Learning to rank enhances cross-sectional strategies. Quant Trading Rules. (quantitativo.com) Key finding: independent replication of (1); LambdaMART "tripled the strategy Sharpe vs. classic cross-sectional momentum built by regression."
5. LightGBM Developers. (2025). lightgbm.LGBMRanker (Version 4.6) [Computer software documentation]. (lightgbm.readthedocs.io) Key finding: states that ranking objectives are "mainly for training and applying ranking models," distinguishing them from LGBMRegressor; the library ships dedicated list-wise losses (lambdarank, rank_xendcg) specifically for ordered-list problems like stock selection.

Take-away :rocket:

These sources demonstrate—both theoretically and in large-scale empirical tests—that pair-wise / list-wise LtR losses align directly with the goal of ordering stocks each day, and empirically deliver higher top-bucket alpha and Sharpe than training the same booster with a plain regression objective.

That is why, once Portfolio123 exposes a “Query/Group” field, switching the JSON "objective" to "lambdarank" (or "rank_xendcg") is expected to outperform any regression-type stock-picking tasks.

2 Likes

Thanks for this thread.

Just when I think I’m getting to know ML I rediscover what a small fraction I actually know!

As one YouTube presenter put it: "The ML Problem You've Probably Never Heard Of... regression and classification... but have you heard of this? Learning to Rank."

I especially didn't think this could apply to predicting the best stock to purchase, but I was probably wrong.

It took me a while to get my head around what they are actually doing, and I'm still coming to grips with the concept. However, it does seem to solve one of the flaws I have observed with regression and tree solutions: they do a very good generalized prediction of typical stock performance, but at least for me they don't pick up any of the out-of-the-ordinary situations which are infrequent but profitable.

Although this technique isn't an obvious method to use for stock selection, there are enough comments and papers on the better results for stocks that it looks worth pursuing.

One thesis comes from, believe it or not, Khalifa University of the United Arab Emirates. It is based on technical price indicators, completely different from the primarily fundamental approach we are using. But the thesis has a nice introductory overview of the general approach, flow diagrams, and even some example code.


1 Like

Further evidence is out.

Guessing rank > guessing returns in this example.

3 Likes

Previous discussions in this thread focused on gradient boosting methods for rank ordering. This paper focuses on Ordered Logit Regression. This may or may not be a great fit with P123’s historical use of ranks and ordering. I have not tried it. But it has the advantage of transparency and lower computational costs compared to many ML methods.

In fact, Gemini thinks the computational requirements are such that… Well, here is the entire quote:

"Cost: As we discussed, you can run this on a standard laptop in seconds, whereas training a Listwise Neural Network requires a GPU and serious infrastructure."

2 Likes

To add a bit here: most papers rank into quintiles, deciles, or 1/25th portions. Realistically, it is unlikely we would trade that number of stocks in a single strategy. I have found LGBM regressors trained on z-scores to be better in almost every case compared to XENDCG (based on local reproduction).
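For anyone wanting to reproduce the regression-on-z-scores setup being compared here, the usual trick is to z-score the target cross-sectionally per date before training the regressor. A minimal sketch with toy column names (`date`, `ret`) as assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "date": np.repeat(["2024-01-31", "2024-02-29"], 100),
    "ret": rng.normal(size=200),
})

# Cross-sectional z-score: each date's returns get mean 0, std 1, so the
# regressor learns relative (rank-like) targets rather than raw returns.
df["z"] = df.groupby("date")["ret"].transform(lambda r: (r - r.mean()) / r.std())

# df["z"] would then be the label passed to the LGBM regressor.
```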

3 Likes

Have you tried ordered logit regression?

It is true that it uses something like deciles for training targets.

It still ultimately orders the stocks just like P123’s ranking system. You could use sell rules like RankPos and the coefficients can be put into a ranking system.

Almost as if it were designed for P123.

Addendum on RankPos and ordering.

I have used binary classification models, specifically Random Forest classification models, on P123 data. The classes can be positive return and negative return. Rank is obtained through the probabilities. The models do quite well, although not as well as regression models in my experience.

For ordered logit regression the ordering of every stock comes from the probability of being in a bin. So you would pick the 15 stocks with the highest probability of being in the top bin. This is slightly simplified.

The authors discuss using binary classification probabilities and prefer it over regression in the introduction. So they do have a preference for probability methods. But they also believe ordered logit regression is an improvement over binary classification (and return-forecast models as well) which their results suggest.

1 Like

I thought I might give it a go and see if I could learn something.

First lesson: this was actually fairly computationally intensive. I tried a walk-forward backtest, and after a few minutes it had not finished training on the first period, so I shut it off. Not sure if it would have finished if I had given it more time, but it is fairly computationally intensive at best.

With subsampling of the data, it was not a problem to get a rank ordering on all of the stocks, as discussed above.

Thank you for sharing this!

Interesting. Maybe I could try running it in the cloud. I have some cloud compute I can use.

1 Like

Over the past few months, I have benchmarked a variety of loss functions, and MSE remains the most stable baseline for the majority of use cases. Quantile regression is yielding compelling results, so I plan to further investigate its utility in capturing specific return thresholds. Interestingly, the Tweedie distribution (bridging Gamma and Poisson) is performing remarkably well on micro-caps. Given their discrete price movements and high zero-inflation, this alignment makes intuitive sense, and I’ll be diving deeper into the parameterization there.

I’ve tried the following objectives:

Objective | Type
MSE (L2) | Regression
MAE (L1) | Regression
Huber | Regression
Quantile | Regression
Tweedie | Regression
Poisson | Regression
Gamma | Regression
MAPE | Regression
LambdaRank | Ranking
Binary | Classification
Multiclass | Classification
Focal Loss | Custom
Asymmetric | Custom
Spread Loss | Custom

However, the most significant driver of right-tail alpha I’ve identified so far isn't a loss function at all—it's Top-Heavy Sampling. If you haven't integrated this into your training pipeline yet, do it and thank me later. :wink:
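"Top-Heavy Sampling" isn't a standard library term, so here is just one plausible reading of it: up-weight (or oversample) the rows from the top of each date's cross-sectional return distribution, so the model spends its capacity on the right tail it will actually trade. A sketch under that assumption, with arbitrary column names and weights:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "date": np.repeat(["2024-01-31", "2024-02-29"], 200),
    "ret": rng.normal(size=400),
})

# Percentile rank of each stock's return within its own date.
pct = df.groupby("date")["ret"].rank(pct=True)

# 5x training weight on the top quintile; 5.0 and 0.8 are arbitrary choices.
df["weight"] = np.where(pct > 0.8, 5.0, 1.0)

# The weights would then go into training, e.g. lgb.Dataset(..., weight=df["weight"]).
```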

7 Likes

Have you tried goss?

For data sampling, yes. I use it in a few live strategies.

1 Like

Thanks for sharing. I was thinking of trying this out once I get some free time. I went down the agentic AI rabbit hole.

I've got a computer sitting on my desk ready to install a Claw, but I know that once I start going down that rabbit hole I can write off 4 weeks of spare time :roll_eyes: I don't really have a good enough use case to put time into it yet.

1 Like

Yeah, at least 4 weeks :joy:

OpenRouter was the most useful site I found for it

I should mention that Huber (a combination of MSE and MAE) gives me very stable OOS results as well. Example below.

Tested as an imported stock factor, so I can't use "Force Positions into Universe", which inflates turnover.

2 Likes

Very impressive. Would you be open to sharing the same screenshot with vs. without Huber, just so we know its contribution?

Also good

I've spent hundreds of hours testing different objectives. My conclusion is that regression objectives are the most stable ones: MSE, MAE, Huber, Poisson, Tweedie, MAPE... all good.
Classification objectives generate great top-bucket ranking results and somewhat erratic OOS results, but they are worth spending more time on.
With ranking objectives I get such bad results that I suspect I'm doing something wrong. I spent so many hours tweaking (different normalization methods, all methods of truncation, etc.) and gave up.

I suspect that I will be able to get the best results by combining a regression and a classification model into one strategy. The possibilities are endless...

4 Likes