I built my first regression-based model 35 years ago, back when we called them ranking systems instead of models. My employer’s main ranking system worked reasonably well on a large equal-weighted universe, but added almost no value relative to the capitalization-weighted S&P 500, which was the benchmark for an important client portfolio. A talented programmer built a multiple regression program to run on the company’s minicomputer; it took almost 48 hours to run a pooled regression covering 20 years, with one six-month observation period per year. One reason for the slowness was our decision to cap-weight the data (simply by dividing the universe into size groups and entering the data once for the smallest stocks and up to 20 times with increasing market cap). We also broke with the firm’s past practice by choosing to model total return rather than price change (those were the days). The result was a model that was value-oriented, in sharp contrast to the firm’s established ranking system, which favored price momentum, earnings momentum, and earnings surprise. The timing was good, because the LBO boom of the 1980s was getting into full swing.
I moved on less than two years after developing my large-cap model; the client terminated the account; and my work was forgotten. But I was left with a strong conviction that multiple regression can be an excellent tool to determine the factors that have predicted relative performance in the past.
Since my first effort, I have worked on multiple models that were designed to provide a disciplined structure to support fundamental investment decisions. With a colleague, I built my second large-cap model and my first small-cap model using Systat. Later, I delegated the computer work to more talented colleagues, first to one who wrote his own Fortran code and then to one who is a master of R. But I remained closely involved in setting the research agenda and evaluating its results until my retirement at the end of 2013.
Over the course of my investment career, turnaround time for a regression study went from days, to hours, to minutes. But as the work became easier, competition increased, and potential alpha diminished. The sustained outperformance of mega-cap growth stocks aggravated the problem, because most valuation metrics had little or no predictive power for their performance. At the time of my retirement, our large-cap model had added almost no value in the previous couple of years.
Regression works very well at predicting the past; getting it to say anything about the future takes care and intelligence. So let me try to address some of the issues in the recent thread, started by yuvaltaylor, which prompted me to reminisce.
First, collinearity is a problem for scientists, not for practitioners. Our independent variables are inevitably correlated, which means we can’t attribute statistical significance to any one of them. And anyway, stock models struggle to explain even 5% of the variance in returns. So the only concern for practitioners is whether correlations among independent variables are stable enough to produce useful forecasts. In my experience they are, but I have worked only with predictive factors that have an established investment rationale. Data miners might have a different experience.
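To make that stability question concrete, here is a rough sketch in Python (not anything I actually ran; the data is synthetic and the factor names "value" and "quality" are made up for illustration) of estimating the correlation between two predictors period by period and checking whether it drifts:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic panel: one row per (period, stock), two correlated predictors.
rows = []
for t in range(40):  # forty six-month periods
    common = rng.normal(size=500)
    rows.append(pd.DataFrame({
        "period": t,
        "value": common + rng.normal(size=500),
        "quality": 0.5 * common + rng.normal(size=500),
    }))
panel = pd.concat(rows, ignore_index=True)

# One correlation estimate per period; a steady mean with a modest spread
# suggests the relationship is dependable enough to forecast with.
corr_by_period = panel.groupby("period").apply(
    lambda g: g["value"].corr(g["quality"])
)
print(corr_by_period.describe())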
Data quality matters a lot. A single hugely negative book value figure once blew up one of my models. Problems can be largely avoided if predictors are ranked or normalized (forced into a normal distribution). My experience is that the actual data distribution has value, but that means working to control outliers. I recommend winsorizing to deal with any absurd values, followed by several iterations of standardizing and winsorizing. Excess total returns should be logged and possibly winsorized. Regression is sensitive to leverage points.
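A minimal sketch of that cleanup loop, in Python rather than the tools I used; the four-sigma cutoff and three iterations are illustrative choices, not recommendations from my models:

```python
import numpy as np

def standardize(x):
    """Rescale to mean 0, standard deviation 1."""
    return (x - np.nanmean(x)) / np.nanstd(x)

def winsorize(x, z=4.0):
    """Clip standardized values beyond +/- z to exactly +/- z."""
    return np.clip(x, -z, z)

def clean_predictor(x, z=4.0, iterations=3):
    """Alternate standardizing and winsorizing so a leverage point is
    tamed without destroying the shape of the bulk of the distribution."""
    x = standardize(x)
    for _ in range(iterations):
        x = standardize(winsorize(x, z))
    return x

# One absurd book-value-style outlier among otherwise ordinary data.
raw = np.append(np.random.default_rng(1).normal(size=999), -500.0)
print(raw.min(), clean_predictor(raw).min())  # the outlier no longer dominates
```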
Use a forecast horizon that makes sense for your predictors. I find six months reasonable for most fundamental variables. Even estimate revisions, whose effect seems to dissipate quickly, tend to have a later second life, probably because of serial correlation. The effectiveness of a deep-value factor like price-to-sales can increase over several years.
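One way to choose a horizon is simply to measure it. Below is a toy check (synthetic data with a small planted effect, so the output is only illustrative) of a factor's rank correlation with forward returns at several horizons; on real data, a deep-value factor might show its rank IC building out toward the multi-year horizons:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_stocks = 200

# Factor scores at month 0, with a small planted effect on monthly drift.
factor = pd.Series(rng.normal(size=n_stocks))
drift = 0.002 * factor.values
monthly = rng.normal(0.005, 0.06, size=(36, n_stocks)) + drift
prices = pd.DataFrame(100 * np.exp(np.vstack([np.zeros(n_stocks),
                                              np.cumsum(monthly, axis=0)])))

# Spearman rank IC of the factor against forward returns, by horizon.
for months in (1, 3, 6, 12, 36):
    fwd = prices.iloc[months] / prices.iloc[0] - 1.0
    ic = factor.corr(fwd, method="spearman")
    print(f"{months:>2}-month horizon: rank IC = {ic:+.3f}")
```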
Don’t worry about having overlapping observation periods, since you’re not trying to prove statistical significance. The more starting points you can test, the better. I recommend running regressions on each test period and averaging the coefficients rather than pooling all the observations. Pooling can reduce the impact of outliers, but it can distort results if the number of observations per period varies. And I feel it’s important to see how the coefficients change over time.
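In code, the per-period approach looks something like the sketch below (synthetic data throughout; in spirit it is the familiar Fama-MacBeth procedure of running a cross-sectional regression each period and averaging the coefficients over time):

```python
import numpy as np

rng = np.random.default_rng(3)
n_periods, n_stocks, n_factors = 40, 500, 3
true_beta = np.array([0.02, -0.01, 0.005])  # planted factor payoffs

coefs = []
for _ in range(n_periods):
    X = rng.normal(size=(n_stocks, n_factors))         # standardized exposures
    y = X @ true_beta + rng.normal(0, 0.15, n_stocks)  # excess log returns
    X1 = np.column_stack([np.ones(n_stocks), X])       # add intercept
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    coefs.append(b[1:])                                # keep factor slopes

coefs = np.array(coefs)
print("mean coefficients:   ", coefs.mean(axis=0).round(4))
print("period-to-period std:", coefs.std(axis=0).round(4))  # stability check
```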
If your results are too good, something is wrong. I once found that a surprisingly effective long-term price change factor was subtly biased in favor of stocks that would have future splits. More commonly, risk factors look compelling in bull markets.
Aside from the difficulty of doing it right, the biggest drawback I see in multiple regression is that it tries to fit the entire distribution of returns, while I’m only interested in one or both tails. In building ranking systems on this platform, I have therefore used rolling backtests of top-decile performance to optimize the weights of individual factors and composite nodes. Based on this experience, as well as my past experience with multiple regression, I am convinced that equal weighting factors is suboptimal.
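As a caricature of that weighting exercise, here is a sketch (synthetic returns with deliberately unequal planted factor payoffs) that scores stocks with a two-factor composite, backtests the top decile at every start date, and searches over the weight split; with unequal payoffs, the best weights are not 50/50:

```python
import numpy as np

rng = np.random.default_rng(4)
n_periods, n_stocks = 40, 500

# Synthetic panel: two factor scores and realized excess returns per period.
f1 = rng.normal(size=(n_periods, n_stocks))
f2 = rng.normal(size=(n_periods, n_stocks))
rets = 0.02 * f1 + 0.005 * f2 + rng.normal(0, 0.15, (n_periods, n_stocks))

def top_decile_return(w1):
    """Average excess return of the top decile by composite rank,
    across all rolling start dates, for weights (w1, 1 - w1)."""
    score = w1 * f1 + (1.0 - w1) * f2
    out = []
    for t in range(n_periods):
        cutoff = np.quantile(score[t], 0.9)
        out.append(rets[t][score[t] >= cutoff].mean())
    return np.mean(out)

# Crude grid search over the weight split; equal weighting is rarely the peak.
for w1 in np.linspace(0, 1, 11):
    print(f"w1={w1:.1f}  top-decile mean excess return: {top_decile_return(w1):+.4f}")
```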
Finally, a pitch for two recent feature requests. My rolling backtests would be much easier to do if I could specify the equal-weighted selection universe as my benchmark.
See: https://www.portfolio123.com/mvnforum/viewthread?thread=11356
And testing a large-cap ranking system in a portfolio would work much better if we could use the active (benchmark-relative) weight to set the position size. After all, active weights determine benchmark-relative performance.
See: https://www.portfolio123.com/mvnforum/viewthread?thread=11355