Tom, I apologize for the late reply. Thank you for your thoughtful post; let me flesh out what I meant in my earlier post.
In the OP Denny talks about “ranking” R2G ports. I think he implies, and others implicitly agree with him, that ranking will help protect R2G subscribers from subscribing to, or help them drop, lemon R2G ports.
Definition of a lemon R2G port: a port which consistently under-performs its benchmark index or benchmark ETF on an excess-return basis, out of sample (OOS).
This is a subjective definition, and no end point for judgement is specified, but such a port will identify itself as it loses subscribers, gets ignored, and is possibly removed by its designer and sent to the R2G graveyard.
Like porn, I think lemon ports can’t be precisely defined, but we’ll all know them when we see them.
I introduce a new concept to help with the Sisyphean task of managing a portfolio of R2G ports: personal exclusion of some ports from consideration for ranking, at least for the time being.
I certainly agree, and did not intend to imply otherwise, but its presence is bad.
As R2Gs mature with more OOS (often a year now for many models, nearly two for some), I am using a crude, down-and-dirty tool: treat 2014 as a block of OOS excess performance and compare it to the excess performance of previous years, which are generally in-sample. For some ports, 2013 is mostly OOS too. If 2014 is the worst (or best) year ever for excess performance, I consider the risk of overfitting too high and reject the port from further consideration for subscription at this time.
It is just one tool, not sufficient by itself, that acts as a sensitive but not specific test for the possibility of undue risk of over-optimization/over-fitting. Notice my emphasis on the word “risk”: I’m not saying I know with certainty whether over-fitting is present in a port.
I leave aside the issue of whether something else is causing worst-ever yearly excess performance in 2014 for many R2G ports of many different types.
I have yet to encounter an R2G port having its best-ever excess performance in 2014.
I simply reject the R2G port in question, at this time, from inclusion in my portfolio of R2G ports. In the fullness of time, with more OOS, this down-and-dirty approach will become unnecessary as each port declares itself a lemon, or not.
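For concreteness, here is a minimal sketch of how this screen might be encoded. The function name, the `n_oos` parameter, and the “mitigating evidence” refinement (an earlier OOS year that beat at least one in-sample year) are my own framing of the rule, not an exact specification of it.

```python
# Down-and-dirty over-fitting screen: a sketch, not the exact rule.
# Assumes yearly excess returns (in %) ordered oldest-first, with the
# final n_oos entries out of sample (OOS).

def overfit_risk(excess, n_oos=1):
    """True when the screen flags undue risk of over-fitting."""
    in_sample, oos = excess[:-n_oos], excess[-n_oos:]
    latest = oos[-1]

    # Core rule: flag only when the newest OOS year is the worst
    # (or best) year ever for excess performance.
    if min(excess) < latest < max(excess):
        return False

    # Mitigating evidence (my assumption): an earlier OOS year that
    # out-performed at least one in-sample year lowers the flag.
    for year in oos[:-1]:
        if any(year > x for x in in_sample):
            return False
    return True

# Excess row in the style of the first table below, with 2013 and 2014
# both treated as OOS: 2014 is worst-ever, but 2013 beat several
# in-sample years, so the screen does not flag it.
excess_row = [58, 96, 102, 49, 41, 69, 15, 21, 244, 52, 31, 47, 42, 13]
print(overfit_risk(excess_row, n_oos=2))  # False
```

With `n_oos=1` and a worst-ever final year, the same function returns True, which is the “reject for now” case.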
Overfitting: Define it, Identify it
When I did my Master of Science in Computer Engineering I worked with backpropagation neural networks (BPNN). The nodes in the network, like those of P123, are the “rules” for learning to predict an outcome. With too few nodes, the system can’t learn to predict. With too many, the network memorizes what worked in the past, but the trade-off is that the BPNN can’t generalize enough to predict what will likely happen in the future, i.e., it STILL can’t learn. It has overfit the data; it’s useless. Finding an optimal number of nodes (rules) can be done with the tools of artificial intelligence, but a discussion of this would be long, and such tools currently cannot be employed on P123.
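The memorize-versus-generalize trade-off is easy to demonstrate outside of neural networks too. The sketch below (my illustration, not a P123 capability) fits polynomials of increasing degree to noisy samples of a sine wave: a degree-9 polynomial through 10 points drives training error to essentially zero by memorizing the noise, yet its out-of-sample error does not follow.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 noisy training samples of one period of a sine wave.
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.1, 10)

# Held-out test points (noise-free truth) between the training points.
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test)

def errors(degree):
    """Mean-squared error on the training data and on held-out data."""
    coefs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coefs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coefs, x_test) - y_test) ** 2))
    return train_mse, test_mse

# A degree-9 polynomial through 10 points interpolates them exactly:
# training error is ~0 ("memorized"), but that says nothing about OOS.
```

Adding degrees (rules) always fits the past at least as well; that is exactly why in-sample performance alone can’t distinguish learning from memorizing.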
So, like artificial intelligence, do some R2G ports have so many rules that they memorize what worked in the past but simply can’t generalize enough to predict what will likely work in the future?
Looking at something like Filip’s SuperValue ranking system, with 20+ ranking rules, I worry. I never saw a good BPNN with that many rules. It’s not unreasonable to assume some R2Gs may have too many rules and overfit.
Let’s look at Hemmerling Value Rockets, launch date April 3, 2013, to show my thinking as a prospective client of Hemmerling.
The Sortino, Sharpe, performance and so on are all great. Worthy of consideration so far.
Model    45   72  129   58   44   82   19  -18  267   65   31   60   71   27
Bench   -13  -23   26    9    3   14    4  -38   23   13    0   13   30   13
Excess   58   96  102   49   41   69   15   21  244   52   31   47   42   13
(One column per year, oldest on the left.)
The last column is year 2014. It is the worst year ever for excess performance for the port, a negative in my view. But in this case, year 2013 is mostly OOS, and it is a good year, outperforming four previous years. The way I apply it, the tool cannot say there is a risk of overfitting.
Hemmerling High Yield Russell 1000. 2014 is not the worst year. 2013 is mostly OOS and outperformed six previous years. It shows good year-to-year variability. No discernible risk of over-fitting.
Alpha Max - 10 Large Cap Stocks w/ Improved Metrics-V4 - No Hedge. Launched May 2013. Same situation as Value Rockets. Most of 2013 is OOS, in my view, and compares well to previous years, no risk of over-optimization identified this way. 2014 outperformed 2010. V5: too little OOS data, really, but as of this writing 2014 matches 2010. Pass on it [i]for now[/i] just because of too little OOS compared to peers.
Tom’s SX20, launched Oct 11, 2013. Most of 2013 is in-sample. 2014 looks poised to outperform 2012, but not by much. From my point of view, weak risk of over-optimization.
Tom’s SX10 launched Sept 2013, so 2013 is mostly in-sample. 2014 excess performance: worst year ever. As a prospective client of this R2G port, I would let 12 months go by before studying it again. A lot of R2G ports currently fit this same yearly performance profile.
As the prospective client, I can’t take the risk that it is over-optimized. If other ports show better variability and have other performance metrics that are comparable to or even better than this one’s, I would spend my time examining those ports more closely.
Let me choose one of them; I will keep it anonymous. Launch date: April 14, 2014.
Model   426  187  111  153  263  151  134  180  209  322  151  175  123   50   63   -6
Bench    30    6  -14  -14   24   12   22   15    7  -35   31   14  -11    4   10    7
Excess  396  181  125  167  238  139  112  165  202  357  120  161  135   46   53  -13
Sortino is 8.42! Yeah, sure, this port looks like the answer to my prayers. Last column is year 2014. But my down-and-dirty tool is screaming “high-risk” at me.
Sure, I’ll look at it again in a year, but I doubt I will ever subscribe to it.
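As an aside on that 8.42 Sortino: the headline risk-adjusted metrics are themselves worth sanity-checking. Below is one common way to compute Sharpe and Sortino from periodic returns. The zero target and risk-free rate, the population downside deviation, and the square-root-of-periods annualization are all my assumptions; P123’s exact conventions may differ, so don’t expect these numbers to match the site.

```python
import math

def sharpe(returns, rf=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from periodic returns (my conventions)."""
    excess = [r - rf for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / len(excess)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)

def sortino(returns, target=0.0, periods_per_year=12):
    """Annualized Sortino ratio: penalizes only downside deviations."""
    excess = [r - target for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in excess) / len(excess))
    return mean / downside * math.sqrt(periods_per_year)

# Hypothetical monthly returns with shallow draw-downs: Sortino comes
# out far above Sharpe, which is one reason a sky-high Sortino alone
# shouldn't seal the deal.
monthly = [0.05, 0.03, -0.01, 0.04, 0.02, -0.02, 0.06, 0.01]
```

Because Sortino divides by downside deviation only, a short backtest with few or shallow losing periods can make it arbitrarily large; that pairs naturally with the worst-year-ever check above.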
Let’s look at a couple of individual designers. I have come to appreciate Marc Gerstein more: only one of his seven R2G ports is flagged by my method as at risk of over-fitting.
DennyHalwes. All four ports show the same profile: 2013 and 2014 are both OOS for my practical purposes, 2014 is always the worst year ever for excess performance, and 2013 beats out just one other year in all four cases. From my point of view, weak risk of over-fitting in all cases. I would look at them again in a year.
I appreciate your candour. I suspect there is something unique about 2014, not just over-optimization, that is degrading performance of many ports despite a variety of investment themes. Marc Gerstein seems unaffected by this phenomenon, whatever it is. But that topic is outside the scope of this post.
Your candour leads me to the nature of the challenge confronting the prospective R2G client: it’s the used-car problem from game theory. The vendor has more knowledge than the buyer, so the buyer is at a disadvantage and must use indirect means to obtain the information needed to support a decision. He doesn’t want to buy a lemon.
I do not disagree with any of this.
I employ several different tools to evaluate ports and support my subscribing decisions, most already mentioned in this thread. As time passes, I learn more and circumstances change, so I continuously adjust my portfolio of R2Gs. There is more material to work with now than there was when R2G started almost two years ago, more opportunity, but it is still buyer beware.
In conclusion, Denny started us off looking at ranking R2G ports. I look at the issue as one of what decision support tools do I have, and how do I use them, to design and maintain a portfolio of R2G ports in real-time.
Cheers,
Randy