I’ve now done even more correlation studies, and I’ve come to the conclusion that OLS alpha probably still provides the best correlation between an 8-year 100-stock simulation and the subsequent 3-year performance of a 20-stock portfolio, but only by a whisker. LAD alpha is good, CAGR is good, median and mean excess monthly returns are both good, and Theil-Sen estimation is good. While OLS alpha usually leads the pack in my correlation tests, the others trail it by less than 0.03 in correlation coefficient.
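For anyone who wants to see what two of these estimators look like side by side, here is a minimal Python sketch of OLS alpha and Theil-Sen alpha. It assumes alpha is the intercept of a regression of monthly portfolio returns on monthly benchmark returns. The function names and the sample numbers are hypothetical, for illustration only; this is not the code behind the figures quoted in this post.

```python
from itertools import combinations
from statistics import mean, median


def ols_alpha(bench, port):
    """Intercept of the least-squares line port = alpha + beta * bench."""
    mb, mp = mean(bench), mean(port)
    cov = sum((b - mb) * (p - mp) for b, p in zip(bench, port))
    var = sum((b - mb) ** 2 for b in bench)
    beta = cov / var
    return mp - beta * mb


def theil_sen_alpha(bench, port):
    """Intercept from Theil-Sen: median pairwise slope, then median residual."""
    pairs = list(zip(bench, port))
    slopes = [
        (p2 - p1) / (b2 - b1)
        for (b1, p1), (b2, p2) in combinations(pairs, 2)
        if b2 != b1  # skip pairs with identical benchmark returns
    ]
    beta = median(slopes)
    return median(p - beta * b for b, p in pairs)


# Made-up monthly returns: portfolio = 0.2% + 1.5 * benchmark, exactly.
bench = [0.01, 0.02, -0.01, 0.03, 0.00]
port = [0.002 + 1.5 * b for b in bench]
print(ols_alpha(bench, port), theil_sen_alpha(bench, port))
```

The pairwise-slope loop grows with the square of the number of months, which is one reason Theil-Sen gets unwieldy in a spreadsheet.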
In my latest tests, I first tested 40 radically different strategies: different rankings, different universes, different holding times. The results (the average of plain correlation and rank correlation) were OLS alpha 0.781, CAGR 0.775, LAD alpha 0.767, mean excess returns 0.763, median excess returns 0.758, and alpha-sigma ratio (OLS alpha divided by s.d.) 0.725. I then tested 30 very similar strategies, the kind I use in my everyday trading, to see if one method was better at catching the little differences between them. This time the order was almost the reverse: median excess returns 0.580, LAD alpha 0.573, mean excess returns 0.570, CAGR 0.565, OLS alpha 0.557, and alpha-sigma ratio 0.419. But with the exception of the alpha-sigma ratio, the correlations are all so close that I don’t think one can make a definitive judgment about which is best. As for Theil-Sen estimation, its results are very close to those of median excess returns, and it’s extremely cumbersome to calculate, so I’m just guessing it’s in the same range. I’d rather not use it because it’s so difficult to manage.
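The score quoted above, the average of plain (Pearson) correlation and rank (Spearman) correlation, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, ties get averaged ranks, and the inputs would be one in-sample measure per strategy paired with that strategy's out-of-sample result.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Plain (Pearson) correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def blended_correlation(x, y):
    """Average of plain correlation and rank correlation."""
    return (pearson(x, y) + spearman(x, y)) / 2
```

Averaging the two keeps a single outlier strategy from dominating the score while still rewarding a measure that gets the magnitudes right, not just the ordering.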
This brings me to Jim’s suggestion of a paired t-test. I may be wrong, but from what I’ve read, the number that I’d be looking for with a paired t-test is Cohen’s d, which is basically the mean excess return divided by the standard deviation of the excess returns, because that’s the measure of how powerful the effect is. Now whenever I divide anything by the standard deviation, I end up in trouble. My correlation coefficient drops by at least 0.05, and sometimes by as much as 0.2. For a while I was really taken with the alpha-sigma ratio (partially because I invented it myself), but its correlation just isn’t as high as that of plain old alpha. And you’ve probably already heard about my gripes with the Sharpe ratio.
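To make the relationship concrete, here is a small Python sketch of Cohen’s d and the paired t-statistic computed from the same monthly excess returns. The function name and the sample returns are made up for illustration, and I’m assuming the sample standard deviation (n - 1 in the denominator). The t-statistic is just d multiplied by the square root of the number of months, so both involve the same division by standard deviation.

```python
from math import sqrt
from statistics import mean, stdev


def paired_effect(portfolio, benchmark):
    """Return (Cohen's d, paired t-statistic) for two matched return series."""
    diffs = [p - b for p, b in zip(portfolio, benchmark)]
    d = mean(diffs) / stdev(diffs)  # effect size: mean excess / s.d. of excess
    t = d * sqrt(len(diffs))        # same as mean / (s.d. / sqrt(n))
    return d, t


# Hypothetical monthly returns, four months only, to keep the arithmetic visible.
portfolio = [0.03, 0.01, 0.04, 0.02]
benchmark = [0.01, 0.01, 0.01, 0.01]
d, t = paired_effect(portfolio, benchmark)
print(round(d, 4), round(t, 4))
```

Ranking strategies by d or by t gives the same order whenever every strategy is tested over the same number of months.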
Now if I’m wrong about the t-test, I’d be happy to give it another try.