I’m sure there’s a way to say this in fewer words, but my brain’s having trouble…
Here’s my thought:
As I understand it, when we rank stocks, each stock’s rank on a metric is relative to the rest of the universe at that moment, not to any fixed value. My portfolio may average a PB of 1 this year but a higher PB in another year, depending on market conditions and on how all investors are investing. Studies show people tend to crowd into the most effective strategies over time as education increases and more studies are released. This makes the distribution of a metric, say price to book, drift over time. Your cheapest 10% of companies may end up with a PB closer to what the cheapest 20% used to have, or closer to the average (and that next 10% of companies likely has much better other metrics). So we’re backtesting and choosing percentiles whose underlying factor values are always shifting around. I’m wondering if it would be better to create a regression that predicts the expected return using the actual value of the factor. I know we can do regressions on ranks, but again, a given rank may represent something quite different 10 years from now if many people crowd the strategy. I suspect this is also why many factors show a bell curve when we plot the returns: high EPS growth stocks get too much attention, and the top 10% of them have stretched valuations from overcrowding.
Is it possible to create a predictive model that uses factors’ underlying values as the X values and the annual return as the Y, without using ranks?
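To make the question concrete, here is a rough sketch in plain Python (outside P123, and not a claim about how P123 works internally). All the numbers are made up and the helper names `ols_fit` and `percentile_ranks` are just mine. It shows two universes where the ranks are identical but the raw PB values are very different, and then fits a simple one-factor regression on the raw values instead of the ranks.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one factor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def percentile_ranks(values):
    """Rank each value from 0 (lowest) to 100 (highest) within its own period.
    Ties are not handled; this is only an illustration."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / (len(values) - 1)
    return ranks

# Hypothetical PB values for the same five "slots" in two different years.
pb_wide = [0.5, 1.0, 2.0, 4.0, 8.0]    # wide valuation spread
pb_narrow = [1.5, 1.8, 2.0, 2.3, 2.6]  # crowded, narrow spread

# The ranks cannot tell the two years apart, even though the raw values can.
print(percentile_ranks(pb_wide))
print(percentile_ranks(pb_narrow))

# Hypothetical next-year returns (%), pooled across both years.
returns = [18.0, 14.0, 9.0, 5.0, 1.0, 10.0, 9.5, 9.0, 8.5, 8.0]
slope, intercept = ols_fit(pb_wide + pb_narrow, returns)
print(f"expected return = {intercept:.2f} + {slope:.2f} * PB")
```

A real version would need many stocks and many periods, some handling of outliers and negative book values, and more than one factor, so treat this only as a picture of the raw-value versus rank distinction.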
Is that the purpose of the AI? Is that the only way to create a predictive formula? Does P123 use ranks to predict or does it run regressions on the factor value itself?
Sometimes there are moments in history when the distribution is very narrow and a growth stock has almost the same valuation ratio as a value stock. In those moments I imagine it’s probably better to buy the growth stock. For example, see figure 4 around 2009/2010 in “Growth vs Value” from Yardeni Research.
Ideally I’d want to give a factor less weight in my model when its distribution narrows, and more when it widens. If a growth stock has a similar PE to a value stock, I’d like the model to weight growth higher automatically. I think a foolproof model would account for factor timing, at least with regard to valuation drift and overcrowding.
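Roughly what I have in mind, sketched in plain Python rather than anything P123-specific. The function names, the 20th/80th percentile choice, the `reference_spread` input (some long-run "normal" spread you would have to estimate yourself) and the 0.5x to 1.5x clamp are all assumptions of mine, not tested values.

```python
from statistics import quantiles

def valuation_spread(values):
    """Gap between the 80th and 20th percentile of a factor's raw values."""
    cuts = quantiles(values, n=5)  # four cut points: 20th, 40th, 60th, 80th
    return cuts[3] - cuts[0]

def spread_scaled_weight(base_weight, values, reference_spread,
                         floor=0.5, cap=1.5):
    """Scale a factor's weight by how wide today's spread is versus a reference.
    The multiplier is clamped so one extreme period cannot dominate."""
    ratio = valuation_spread(values) / reference_spread
    multiplier = max(floor, min(cap, ratio))
    return base_weight * multiplier

# Hypothetical PB cross-sections and a hypothetical long-run spread of 3.0.
pb_wide = [0.5, 1.0, 2.0, 4.0, 8.0]
pb_narrow = [1.5, 1.8, 2.0, 2.3, 2.6]

w_wide = spread_scaled_weight(20.0, pb_wide, reference_spread=3.0)
w_narrow = spread_scaled_weight(20.0, pb_narrow, reference_spread=3.0)
print(w_wide, w_narrow)  # the value factor gets less weight when the spread is narrow
```

The weight taken away from value in a narrow period would have to go somewhere (growth, in my example), so in practice the weights would need renormalizing across all factors afterwards.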
Building on this - is it possible to add a node to a P123 model that checks the average value of a factor among the top 10% by rank, then adjusts that factor’s weight accordingly?
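I don’t know whether a node like that exists in P123, so this is only the calculation I’m picturing, written in plain Python. The names `top_decile_average` and `adjusted_weight`, and the idea of comparing the top decile’s average to the universe median and to a `reference_ratio` from history, are my own assumptions.

```python
from statistics import mean, median

def top_decile_average(values, lower_is_better=True):
    """Average raw factor value of the best-ranked 10% of stocks."""
    ordered = sorted(values, reverse=not lower_is_better)
    k = max(1, len(ordered) // 10)  # always keep at least one stock
    return mean(ordered[:k])

def adjusted_weight(base_weight, top_avg, universe_median, reference_ratio):
    """Scale the weight by how cheap the top decile is relative to the median,
    compared with a historical reference ratio. Assumes positive factor values,
    so negative book values or earnings would need filtering out first."""
    ratio = universe_median / top_avg
    return base_weight * ratio / reference_ratio

# Hypothetical PB values for a 20-stock universe.
pb = [0.4, 0.6, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.0, 2.2,
      2.4, 2.6, 2.9, 3.1, 3.5, 4.0, 4.6, 5.2, 6.0, 8.0]

top_avg = top_decile_average(pb)  # average PB of the cheapest two stocks
weight = adjusted_weight(20.0, top_avg, median(pb), reference_ratio=3.0)
print(top_avg, weight)
```

Tracking `top_avg` year by year would also show directly how much the raw value behind "top 10%" drifts, which is the thing I’m worried about in the first place.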