There is another kind of risk requiring diversification that the quants have not discussed – yet – mainly because I don’t think they’ve figured out that it exists. It’s unique to our kind of data-driven investing; i.e., it hasn’t been very relevant in the past but it probably will become mainstream in the future. It’s a version of the mis-specified model.
It’s the risk of oddball phenomena causing a factor (a growth rate, a debt ratio, a valuation metric, anything) to not really mean what a quick glance at the number leads one to think it means.
This is not about data errors or database design or policy. It’s about the reality that data cannot fully capture the infinite variety that’s out there and can never be truly perfect. There’s a widespread impression that working with data involves precision. Actually, it’s quite the opposite. It requires us to embrace the adage that it’s better to be vaguely right than precisely wrong. That’s why, contrary to quant best practices in other fields, we’re better off using lots of factors and even embracing redundancy or correlations among factors. We need protection against oddball data readings. (So five ways to define growth are better than one.)
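To make the "five ways to define growth" point concrete, here is a minimal sketch. The factor names and numbers are made up for illustration, and taking the median is just one simple way to combine redundant readings, not a description of how any particular ranking system does it.

```python
from statistics import median

# Hypothetical growth readings (in percent) for a single stock.
# All names and values are invented for illustration only.
growth_readings = {
    "eps_growth_1y": 900.0,       # oddball: assume prior-year EPS was near zero
    "eps_growth_5y": 6.0,
    "sales_growth_1y": 4.0,
    "sales_growth_5y": 5.0,
    "cash_flow_growth_1y": 3.0,
}

# View 1: rely on one growth definition and take it at face value.
single_view = growth_readings["eps_growth_1y"]

# View 2: combine all five (redundant, correlated) definitions.
# The median is an assumption here; any robust combination would do.
composite_view = median(growth_readings.values())

print(f"single factor says growth = {single_view:.1f}%")
print(f"five-factor composite says growth = {composite_view:.1f}%")
```

The single factor paints this company as a hyper-grower. The composite, built from deliberately overlapping definitions, is barely moved by the one oddball reading.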
Similarly in portfolios, we have to recognize that despite our best efforts, our models will still pull in some stocks we don’t really want. A 20-stock portfolio based on a carefully constructed model gives us a better chance of properly implementing our idea than a five-stock portfolio, even with the same model.
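A rough way to see the portfolio-size effect is sketched below. It assumes, purely for illustration, that each stock the model picks has an independent 20% chance of being one we don’t really want, and asks how likely it is that at least 40% of the portfolio ends up that way. Both the 20% rate and the independence assumption are hypothetical; real model misses are probably correlated.

```python
from math import comb

def prob_at_least(n, k, p):
    """Binomial probability of k or more 'unwanted' stocks among n picks,
    assuming each pick is unwanted independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical miss rate: 1 in 5 picks is a stock we don't really want.
miss_rate = 0.2

# "At least 40% of the portfolio is unwanted" means
# 2 or more of 5 stocks, or 8 or more of 20 stocks.
p_five = prob_at_least(5, 2, miss_rate)
p_twenty = prob_at_least(20, 8, miss_rate)

print(f"5-stock portfolio:  {p_five:.1%} chance that 40%+ is unwanted")
print(f"20-stock portfolio: {p_twenty:.1%} chance that 40%+ is unwanted")
```

Under these assumptions the same model with the same miss rate is far more likely to be badly distorted in the five-stock version. The expected share of unwanted stocks is identical; what shrinks with more stocks is the chance of an unlucky cluster.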
So I think number of factors is relevant (although not in the way others discussing R2G have believed; to me, more is better from a risk control perspective), as is number of stocks (more is better – to a point, and figuring that out is the hard part). I’ll try to phrase it for the Google doc later.