What stats do you want for the Model Leaderboard?

SZ · September 21, 2025, 12:29pm

Some risk ideas:

Downside Volatility (Volatility of negative returns).
Max overall historical drawdown, to help assess potential downside volatility. The largest monthly or quarterly negative return could be insightful too.
Average bid/ask spread of stocks in simulation. The lower the better. Generally, a model that performs the same but with higher liquidity should be more desirable.
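
To make the first two ideas concrete, here is a rough sketch in Python. It assumes a plain list of periodic returns as decimals (0.02 = 2%), and it assumes "downside volatility" means downside deviation below a 0% threshold, which is one common definition and not necessarily the one the site would use. The function names are made up for illustration.

```python
import math


def downside_volatility(returns, threshold=0.0):
    """Downside deviation: root-mean-square of shortfalls below `threshold`.

    Assumption: returns at or above the threshold count as zero shortfall
    and stay in the denominator. Other definitions exist.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    shortfalls = [min(r - threshold, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))


def max_drawdown(returns):
    """Largest peak-to-trough drop of the compounded equity curve.

    Returned as a positive fraction (0.25 means a 25% drawdown).
    """
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def worst_period_return(returns):
    """Most negative single-period return (e.g. worst month or quarter)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    return min(returns)


# Hypothetical monthly returns, for illustration only.
monthly = [0.10, -0.05, 0.02, -0.10]
print(downside_volatility(monthly), max_drawdown(monthly), worst_period_return(monthly))
```

Passing monthly or quarterly returns to `worst_period_return` gives the "max monthly or quarterly negative return" stat directly.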

Remove Sharpe:

It is a flawed metric, as upside volatility is not necessarily a risk.

If you want a substitute, downside deviation relative to the universe or benchmark could be used.

How is the risk score calculated?

“Risk Scores range from 1 to 5, with 1 being the least risky. The model's Risk Score reflects the combined impact of three criteria:

three month volatility (lower is better)
three month maximum drawdown (lower is better)
number of positions (higher is better)

The scoring of each criteria and the overall score is done on a relative basis: the values are sorted then assigned a percentile depending on the order (not the magnitude).

(i) Since scoring is done on a relative basis, using only three months for statistics like volatility was enough to correctly classify models.”
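
For anyone trying to picture the relative scoring described in that quote, here is a minimal sketch. The rank-based percentile follows the description (order, not magnitude). Everything else is my assumption: the equal-weight average of the three criteria, the mapping to 1 through 5, the tie handling, and the function names. The actual calculation may differ.

```python
def rank_percentiles(values, higher_is_better=False):
    """Percentile in [0, 1] from rank order only; 0.0 = least risky.

    Assumption: ties are broken by input order.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    pct = [0.0] * n
    for rank, i in enumerate(order):
        pct[i] = rank / (n - 1)
    return pct


def risk_scores(volatility, drawdown, positions):
    """Combine three rank percentiles into a 1-5 score (1 = least risky).

    Assumption: equal weights and five equal-width buckets.
    """
    vol_p = rank_percentiles(volatility)
    dd_p = rank_percentiles(drawdown)
    pos_p = rank_percentiles(positions, higher_is_better=True)
    scores = []
    for v, d, p in zip(vol_p, dd_p, pos_p):
        combined = (v + d + p) / 3.0
        scores.append(min(5, int(combined * 5) + 1))
    return scores


# Hypothetical models: three-month volatility, three-month max drawdown, positions.
print(risk_scores([0.10, 0.20, 0.30], [0.05, 0.15, 0.40], [100, 30, 10]))
```

Because only the order matters, `rank_percentiles([1, 2, 1000])` gives the same percentiles as `rank_percentiles([1, 2, 3])`, which is why a short window can still sort models into the same buckets even when the magnitudes are noisy.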

Thoughts:

Are the stats based on just three months of performance, or on a longer period sampled in 3-month windows? If it is just three months total, then:

three month volatility (lower is better): should cover at least a year and use downside volatility. Ideally it would cover the entire live period, but in any case more than 3 months.
three month maximum drawdown (lower is better): should be measured over a longer timespan too.
number of positions (higher is better): agreed, but perhaps it should only penalize models below a specific number of stocks, say 50 or 100, without an extreme reward for, say, a 500-stock model. In other words, cap the score once a model holds enough stocks.
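
A quick sketch of what I mean by capping the positions score. The cap of 100 and the linear ramp below it are just assumptions for illustration, and `capped_position_score` is a made-up name.

```python
def capped_position_score(num_positions, cap=100):
    """Diversification credit in [0, 1] that stops growing at `cap` positions.

    Assumption: credit rises linearly up to the cap, then stays flat.
    """
    if num_positions < 0:
        raise ValueError("num_positions must be non-negative")
    if cap <= 0:
        raise ValueError("cap must be positive")
    return min(num_positions, cap) / cap


# A 500-stock model gets no more credit than a 100-stock model.
for n in (25, 100, 500):
    print(n, capped_position_score(n))
```

The capped value could then be ranked in place of the raw position count, so that models above the cap tie with each other instead of being spread out by size.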