Paper --> Design choices, machine learning, and the cross-section of stock returns

Interesting Paper:

2 Likes

The findings generally support (with some exceptions) things I have observed:

  1. Expanding windows are usually the best choice (see the sketch after this list).
  2. Regression methods outperform classification methods.
  3. Using excess returns relative to the "Market" as the target is crucial.
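
Regarding point 1, here is a minimal sketch of expanding versus rolling training windows using scikit-learn's TimeSeriesSplit; the placeholder data and the Ridge model are my own assumptions, not the paper's setup.

```python
# Expanding vs. rolling training windows on a time-ordered sample.
# Placeholder features/target; the model choice is only illustrative.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))   # time-ordered feature matrix (placeholder)
y = rng.normal(size=1000)        # excess-return target (placeholder)

expanding = TimeSeriesSplit(n_splits=5)                    # training set grows each fold
rolling = TimeSeriesSplit(n_splits=5, max_train_size=200)  # training set capped -> rolling window

for name, splitter in [("expanding", expanding), ("rolling", rolling)]:
    scores = []
    for train_idx, test_idx in splitter.split(X):
        model = Ridge().fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    print(name, "mean R^2:", np.mean(scores))
```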

Regarding the last point, how do they define the "market"? Specifically, is it equal-weighted or cap-weighted?

I believe it is cap-weighted, and I wonder if this is one reason their findings differ somewhat from what many of us at P123 observe: the paper finds that adding micro-caps is not beneficial, whereas in our experience it is.

The use of cap-weighting makes the impact of micro-caps on their definition of "market" negligible. This likely means there is considerably more noise in the excess returns of micro-caps, especially to the extent that their returns are not correlated with the cap-weighted "market."

While not proven, for P123 members who invest in micro-caps, using equal weighting for the universe is probably a better approach. In practice, this suggests that excess returns relative to cap-weighted benchmarks like the Russell 2000 or Russell 3000 may not be ideal for micro-cap or even all-cap strategies. At P123, even all-cap strategies tend to include many micro-caps and small-caps for various reasons, further emphasizing the potential limitations of cap-weighted benchmarks.
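
As a concrete sketch of the weighting issue (mine, not the paper's), the snippet below builds both an equal-weighted and a cap-weighted market return from the same cross-section and derives the two different excess-return targets; the tickers, caps, and column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "date":   ["2024-01"] * 4,
    "ticker": ["MICRO1", "MICRO2", "BIG1", "BIG2"],
    "ret":    [0.08, -0.05, 0.01, 0.02],
    "mktcap": [0.1, 0.2, 500.0, 800.0],  # $B; micro-caps carry negligible cap weight
})

# Equal-weighted market: every stock counts the same.
df["ew_mkt"] = df.groupby("date")["ret"].transform("mean")

# Cap-weighted market: dominated by the large caps.
df["w"] = df["mktcap"] / df.groupby("date")["mktcap"].transform("sum")
df["wret"] = df["ret"] * df["w"]
df["cw_mkt"] = df.groupby("date")["wret"].transform("sum")

# The same stock returns yield two quite different excess-return targets.
df["excess_ew"] = df["ret"] - df["ew_mkt"]
df["excess_cw"] = df["ret"] - df["cw_mkt"]
print(df[["ticker", "excess_ew", "excess_cw"]])
```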

3 Likes

Summary:

Use relative strength as the target (3- to 12-month; agree, if you emphasize total return with roughly market-level risk).

Use nonlinear ML models (agree, with Rank as a preprocessor; see the rank sketch after these points).

Use long(er) time frames to validate and test (agree).

Trend, momentum, beta, and analyst earnings revisions are the most important factors (in this study --> agree on estimate revisions and momentum).
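
A minimal sketch of the rank preprocessing mentioned above: each factor is converted to a cross-sectional percentile rank within its date before it is fed to the nonlinear model. The toy frame and column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2024-01", "2024-01", "2024-01", "2024-02", "2024-02", "2024-02"],
    "value_factor":    [3.1, 0.4, 1.8, 2.2, 5.0, 0.9],
    "momentum_factor": [0.12, -0.03, 0.30, 0.05, 0.22, -0.10],
})

factor_cols = ["value_factor", "momentum_factor"]

# pct=True gives ranks in (0, 1], putting every factor on a common,
# outlier-robust scale within each date's cross-section.
ranked = df.groupby("date")[factor_cols].rank(pct=True)
df[[c + "_rank" for c in factor_cols]] = ranked.to_numpy()
print(df)
```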

"In contrast, post-publication adjustments, feature selection, and training sample size have minimal impact on the outperformance of non-linear models. These findings indicate that more complex machine-learning models require larger training datasets to robustly capture non-linearities and interactions in the data."

Very interesting: this is pure gold. It means that even after factor publication, non-linear ML models can still squeeze out alpha, and they are good at selecting the most important factors...

Agree --> I get the best results in my AI models with long training periods (a basic holdout over 2004-2019, then prediction training from 2014-2019, then an OOS test with the predictor from 2019 to today).
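
A quick sketch of that split scheme, assuming a long panel with a "date" column; the cutoff dates mirror the post, everything else is a placeholder.

```python
import pandas as pd

def split_by_date(df: pd.DataFrame):
    """Split a time-ordered panel into the three periods described above."""
    dates = pd.to_datetime(df["date"])
    holdout    = df[(dates >= "2004-01-01") & (dates < "2019-01-01")]  # basic holdout / model selection
    pred_train = df[(dates >= "2014-01-01") & (dates < "2019-01-01")]  # train the final predictor
    oos        = df[dates >= "2019-01-01"]                             # untouched out-of-sample test
    return holdout, pred_train, oos
```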

1 Like

Other findings (in general):

Small and micro caps --> LightGBM does best (it handles sparse data well!), and one needs to restrict the universe to micro and small caps (see the sketch below).
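
A sketch of that workflow: restrict the universe by market cap first, then fit a LightGBM regressor on an excess-return target. The synthetic data, cap cutoff, and parameters below are placeholders, not the paper's configuration.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "mktcap":     rng.lognormal(mean=6, sigma=2, size=n),  # $M, right-skewed like real caps
    "f1":         rng.normal(size=n),
    "f2":         rng.normal(size=n),
    "excess_ret": rng.normal(scale=0.1, size=n),
})

SMALL_CAP_MAX = 2_000  # $M; hypothetical cutoff for "micro to small"
small = df[df["mktcap"] <= SMALL_CAP_MAX]

model = lgb.LGBMRegressor(
    n_estimators=300,
    learning_rate=0.05,
    num_leaves=31,
    min_child_samples=50,  # guards against overfitting noisy micro-cap data
)
model.fit(small[["f1", "f2"]], small["excess_ret"])
```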

Findings on big caps --> ExtraTrees I to III do best.

Best feature sets --> "Small and Micro Cap Focus" (does very well on small and big caps) and "Core: Sentiment" (does well on mid to big caps).

Using predictors that were trained on one universe and applying them to another --> the results are not good (see the cross-universe sketch below).
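
A sketch of that caveat on synthetic data: fit a predictor on one universe and compare its rank IC in-universe versus on the other universe. The data-generating process, cutoff, and model are placeholders that simply mimic a feature relationship flipping across cap buckets.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 4))
cap = rng.lognormal(mean=6, sigma=2, size=n)  # $M, synthetic market caps

# Target whose relationship to the first feature differs across cap buckets,
# mimicking why a small-cap predictor may not transfer to large caps.
y = np.where(cap < 2_000, X[:, 0], -X[:, 0]) * 0.05 + rng.normal(scale=0.1, size=n)

small_mask = cap < 2_000
model = ExtraTreesRegressor(n_estimators=200, random_state=0).fit(X[small_mask], y[small_mask])

for name, mask in [("small caps (in-universe)", small_mask),
                   ("large caps (cross-universe)", ~small_mask)]:
    ic, _ = spearmanr(model.predict(X[mask]), y[mask])
    print(f"{name}: rank IC = {ic:.3f}")
```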

2 Likes

Jrinne --> I agree on the small caps; in academic studies they "must" cap-weight (it is standard for a reason: they want to produce somewhat scalable models).

But then one does not capture the small-size and low-liquidity effects, which can be exploited in smaller accounts.

3 Likes