How I Rank R2G Ports

Tom, I apologize for the late reply. Thank you for your thoughtful post; let me flesh out what I meant in my own earlier post.

In the OP Denny talks about “ranking” R2G ports. I think he implies, and others implicitly agree with him, that ranking will help protect R2G subscribers from subscribing to lemon R2G ports, or help them drop the ones they already hold.

Definition of a lemon R2G port: a port that consistently underperforms its benchmark index or benchmark ETF, on an excess-return basis, out of sample (OOS).

This is a subjective definition, and no end point for judgement is specified, but such a port will identify itself: it loses subscribers, it is ignored, and it is possibly removed by its designer and sent to the R2G graveyard.

Like porn, I think lemon ports can’t be defined but we’ll all know them when we see them.

I introduce a new concept to help with the Sisyphean task of managing a portfolio of R2G ports: personal exclusion of some ports from consideration for ranking, at least for the time being.

I certainly agree, and did not intend to imply otherwise, but its presence is bad.

As R2Gs mature and accumulate more OOS data, now about a year for many models and nearly two for some, I am using a crude, down-and-dirty tool: treat 2014 as a block of OOS excess performance and compare it to the excess performance of previous years, which are generally in-sample. For some ports 2013 is mostly OOS too. If 2014 is the worst (or best) year ever for excess performance, I consider the risk of overfitting too high and reject the port from further consideration for subscription at this time.

It is just one tool, not sufficient by itself, that acts as a sensitive but not specific test to detect the possibility of undue risk of over-optimization/over-fitting. Notice my emphasis on the word “risk”: I am not saying I know with certainty whether over-fitting is present in a port or not.

I leave aside the issue of whether something else is causing worst-ever yearly excess performance in 2014 for many R2G ports of many different types.

I have yet to encounter an R2G port having its best-year-ever excess performance in 2014.

I simply reject the R2G port in question, at this time, from inclusion in my portfolio of R2G ports. In the fullness of time, with more OOS data, this down-and-dirty approach will become unnecessary as the port declares itself a lemon, or not.

Overfitting: Define it, Identify it

When I did my Master of Science in Computer Engineering I worked with backpropagation neural networks (BPNNs). The nodes in the network, like the rules of a P123 model, are what the system uses to learn to predict an outcome. With too few nodes, the system can’t learn to predict. With too many, the network memorizes what worked in the past, but the trade-off is that it can’t generalize enough to predict what will likely happen in the future, i.e., it STILL can’t learn. It has overfit the data, and it’s useless. Finding an optimal number of nodes (rules) can be done with the tools of artificial intelligence, but a discussion of this would be long, and such tools currently cannot be employed on P123.
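For readers who haven’t played with these, here is a toy illustration of that trade-off; this is not from my thesis work, just a small scikit-learn net fit to synthetic noisy data, so treat it as a sketch of the concept only. As the hidden-node count grows, the in-sample fit keeps improving while the holdout fit stalls or degrades.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Synthetic "noisy market" data: a weak signal buried in a lot of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] + rng.normal(scale=1.0, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for nodes in (1, 4, 16, 64, 256):
    net = MLPRegressor(hidden_layer_sizes=(nodes,), max_iter=5000, random_state=0)
    net.fit(X_tr, y_tr)
    print(f"{nodes:>3} nodes   train R2 = {net.score(X_tr, y_tr):5.2f}   "
          f"holdout R2 = {net.score(X_te, y_te):5.2f}")
# Typically the train R2 keeps climbing as nodes are added while the holdout R2
# stalls or falls: the net starts memorizing the past instead of generalizing.
```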

So, by analogy with these networks: do some R2G ports have so many rules that they memorize what worked in the past but simply can’t generalize enough to predict what will likely work in the future?

Looking at something like Filip’s SuperValue ranking system, with 20+ ranking rules, I worry. I never saw a good BPNN with that many rules. It’s not unreasonable to assume some R2Gs may have too many rules and overfit.

Let’s look at Hemmerling Value Rockets, launch date April 3, 2013, to show my thinking as a prospective client of Hemmerling.

The Sortino, Sharpe, performance and so on are all great. Worthy of consideration so far.

Yearly performance and excess over benchmark, in percent:

Year: Model / Bench / Excess
2001: 45 / -13 / 58
2002: 72 / -23 / 96
2003: 129 / 26 / 102
2004: 58 / 9 / 49
2005: 44 / 3 / 41
2006: 82 / 14 / 69
2007: 19 / 4 / 15
2008: -18 / -38 / 21
2009: 267 / 23 / 244
2010: 65 / 13 / 52
2011: 31 / 0 / 31
2012: 60 / 13 / 47
2013: 71 / 30 / 42
2014: 27 / 13 / 13

Year 2014 is the worst year ever for excess performance for the port, a negative in my view. But in this case year 2013 is mostly OOS, and it is a good year, outperforming four previous years. So, the way I apply it, the tool cannot say there is a risk of overfitting.
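To make the rule concrete, here is a minimal Python sketch of how I apply it, using the Value Rockets excess numbers from the table above. It captures the spirit of the check, not a precise recipe.

```python
def overfit_risk_flag(excess_by_year, oos_years):
    """Down-and-dirty screen: flag 'risk of overfitting' only when every OOS
    year is either worse than every in-sample year or better than every
    in-sample year of excess performance."""
    oos = [excess_by_year[y] for y in oos_years]
    ins = [v for y, v in excess_by_year.items() if y not in oos_years]
    all_worst = max(oos) < min(ins)   # every OOS year would be a worst-ever year
    all_best = min(oos) > max(ins)    # every OOS year would be a best-ever year
    return all_worst or all_best

# Hemmerling Value Rockets yearly excess (%), from the table above;
# 2013 is mostly OOS and 2014 is fully OOS.
value_rockets = {2001: 58, 2002: 96, 2003: 102, 2004: 49, 2005: 41, 2006: 69,
                 2007: 15, 2008: 21, 2009: 244, 2010: 52, 2011: 31, 2012: 47,
                 2013: 42, 2014: 13}
print(overfit_risk_flag(value_rockets, [2013, 2014]))  # False: 2013 was a good OOS year
# A port whose only (mostly) OOS year, 2014, is its worst year ever would return True.
```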

Hemmerling High Yield Russell 1000. 2014 is not the worst year. 2013 is mostly OOS and outperformed six previous years. It shows good year-to-year variability. No discernible risk of over-fitting.

Alpha Max - 10 Large Cap Stocks w/ Improved Metrics-V4 - No Hedge. Launched May 2013. Same situation as Value Rockets. Most of 2013 is OOS, in my view, and compares well to previous years; no risk of over-optimization identified this way. 2014 outperformed 2010. V5: too little OOS data, really, but as of this writing 2014 matches 2010. Pass on it for now just because of too little OOS compared to peers.

Tom’s SX20 launched Oct 11, 2013. Most of 2013 is in-sample. 2014 looks poised to outperform 2012, but not by much. From my point of view, weak risk of over-optimization.

Tom’s SX10 launched Sept 2013, so 2013 is mostly in-sample. 2014 excess performance: worst year ever. As a prospective client of this R2G port, I would let 12 months go by before studying it again. A lot of R2G ports currently fit this same yearly performance profile.

As the prospective client I can’t take the risk that it is over-optimized. If other ports show better variability and have other performance metrics that are comparable to or even better than this one, I would spend my time examining those other ports more closely.

Let me choose one; I will keep it anonymous. Launch date: April 14, 2014.

Year: Model / Bench / Excess
1999: 426 / 30 / 396
2000: 187 / 6 / 181
2001: 111 / -14 / 125
2002: 153 / -14 / 167
2003: 263 / 24 / 238
2004: 151 / 12 / 139
2005: 134 / 22 / 112
2006: 180 / 15 / 165
2007: 209 / 7 / 202
2008: 322 / -35 / 357
2009: 151 / 31 / 120
2010: 175 / 14 / 161
2011: 123 / -11 / 135
2012: 50 / 4 / 46
2013: 63 / 10 / 53
2014: -6 / 7 / -13

Sortino is 8.42! Yeah, sure, this port looks like the answer to my prayers. But 2014 is its worst year ever for excess performance, and my down-and-dirty tool is screaming “high risk” at me.

Sure, I’ll look at it again in a year, but I doubt I will ever subscribe to it.

Let’s look at a couple of individual designers. I appreciate Marc Gerstein all the more: only one of his seven R2G ports is flagged by my method as at risk of over-fitting.

DennyHalwes. All four ports show the same profile. 2013 and 2014 are both OOS, for my practical purposes. 2014 is always the worst year ever for excess performance. 2013 beats out just one other year in all four cases. From my point of view, weak risk of over-fitting in all cases. I would look at them again in a year.

I appreciate your candour. I suspect there is something unique about 2014, not just over-optimization, that is degrading performance of many ports despite a variety of investment themes. Marc Gerstein seems unaffected by this phenomenon, whatever it is. But that topic is outside the scope of this post.

Your candour leads me to the nature of the challenge confronting the prospective R2G client: it’s the Used Car Problem, the “market for lemons” from the economics of asymmetric information. The vendor has more knowledge than the buyer, so the buyer is at a disadvantage and must use indirect means to obtain information to support a decision. He doesn’t want to buy a lemon.

I do not disagree with any of this.

I employ several different tools to evaluate ports and support my subscribing decisions, most already mentioned in this thread. As time passes, I learn more and circumstances change, so I continuously adjust my portfolio of R2Gs. There is more material to work with now than there was when R2G started almost two years ago, more opportunity, but it is still buyer beware.

In conclusion, Denny started us off looking at ranking R2G ports. I look at the issue as one of which decision-support tools I have, and how I use them, to design and maintain a portfolio of R2G ports in real time.

Cheers Randy

rallan - thank you for this knowledgeable post.

I have some experience with neural nets (NNs) but probably not as much as you. I have a couple of comments…

  • While choosing too few or too many nodes will lead to poor results, choosing the “ideal” number will not give OOS results that are as good as backtest.
  • OOS results will degrade with time. A “rule of thumb” is one year OOS for every 4 years of in-sample data optimization. Beyond that, one is pushing one’s luck. So if a model was specifically optimized over the last 5 years then one could expect it would continue to “work” for about 1.25 years assuming no regime change (such as fast dropping oil prices).
  • Nodes are internal to the NN. The “ideal” number of nodes is directly related to the number of inputs. We don’t have an equivalent to “nodes” in our ports.
  • Ranking factors are “inputs” not “nodes” and the quantity of factors should not be judged by the same criteria as for nodes.
  • NNs are only as good as the inputs. i.e. garbage in, garbage out. This is why I gave up on NNs. If you can identify good inputs then why do you need the NN? :slight_smile:

I think that one of the problems you and others have is that you believe there should be a strong relationship between in-sample and out-of-sample results. Some model providers may design with this assumption in mind, but it is wrong to draw this conclusion in general. If you truly like Marc Gerstein’s results then listen to what he has to say about backtests :slight_smile: It is not outside the scope of this thread.

As for how you choose R2G ports, it looks to me as if you are choosing models that are performing the best. This can very easily lead to buying high and selling low. Just as an example, many of the small-cap models that performed exceptionally well in the first six months after R2G started subsequently flopped.

Steve

Further to what is being said, it seems that the Piotroski model has failed in 2014 when it did well earlier. I know my two are doing poorly; the R2Gs are not performing and neither is AAII’s.

Can overfitting also apply to a general model-design thesis? For example, factors based on management effectiveness, concentration in a sector like health care, or a focus on high-dividend stocks in a low-interest-rate environment? Since we cannot see what is inside R2Gs, a model may be selecting stocks on macro trends that worked well over the last 5 years but no longer fit the general market circumstances.

I personally think that Piotroski failed because it led to an over-reliance on energy stocks. Does that mean his concept was bad, or just that the types of stocks selected moved against it? The same could be true of an over-reliance on small-cap versus large-cap stocks; small caps were hurt this last year too.

I guess I am saying that the fundamental ideas in a model could be effective over the last 15 years, on average, but fail in 2014, screwing up the OOS results. I think just looking at 2014 could be misleading. I think developers do need to talk more about what the drivers for the market are.

David,

I agree with your general point, but I’m also laughing because one of my top live-money systems this year is a Piotroski variant. My second one is a microcap system. They’ve done much better than the ‘safer systems’ I launched as R2Gs. And much better than their indexes. My worst-performing system is an SP500-focused system - and that had the best underlying index performance and chance of doing well this year.

@Rallan,
Thanks for the long reply. Good luck. Over the years, I’ve come to the conclusion that I have close to ZERO idea how any pro manager I choose, or model I invest in will do in the next 12 months. I can choose managers (and systems) I will exclude. Those rules are easier. And I’ve vetted at least 500 pro managers in the alternative space and invested in a handful (but tracked dozens - over more than a decade). What I do (sometimes) know is the general conditions in which a system will struggle and a likely range of outcomes I am willing to accept. All I can do is try to build a basket of systems for myself where I understand how they fit together… and then monitor systems / managers to see that they stay within a ‘range’ of expected performance.

If someone wanted to chase high performance in microcaps through a basket of such R2Gs (I stopped offering these because there are a lot, and if I really believe in a system I don’t want to compete with subs, but this is the dominant R2G money maker), that would be an okay approach. But I would expect that most of them would return around the bench, not counting fees. Some number (maybe 20%) would lose a decent amount of money and some number (hopefully 10-20%) would have big up years. So, in total you could make money over the bench. But I would not expect them to have much stability year-to-year in terms of total rankings (unless liquidity is really low, i.e. under $300k ADT100, then it might). The issue is the fees relative to the amount you can invest in them. There is money to be made here, though. But I don’t think it will come by timing allocations. I think it will come by recognizing that the systems will, nearly all, underperform backtest results over a rolling 3-year period… but looking for systems that fit together (whatever that means to you)… or get you to a realistic number of holdings (say 30).
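Just to illustrate the arithmetic behind that basket logic, here is a rough simulation. The 20% / 10-20% proportions come from the paragraph above, but the sizes of the wins and losses are made-up placeholders, not anyone’s actual results.

```python
import random

# Illustrative only: proportions from the comment above, magnitudes made up.
random.seed(1)

def basket_excess(n_ports=10):
    """Average one-year excess return vs. the bench, before fees, for a basket."""
    total = 0.0
    for _ in range(n_ports):
        u = random.random()
        if u < 0.20:                         # ~20% lose a decent amount
            total += random.uniform(-25, -10)
        elif u < 0.35:                       # ~15% have a big up year
            total += random.uniform(25, 60)
        else:                                # the rest land around the bench
            total += random.uniform(-5, 5)
    return total / n_ports

runs = [basket_excess() for _ in range(10_000)]
print(f"mean basket excess before fees: {sum(runs) / len(runs):+.1f}%")
print(f"baskets beating the bench: {sum(r > 0 for r in runs) / len(runs):.0%}")
```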

My bet is also that there are some very good R2G designers - better than most mutual fund managers, and better, in some cases, than ‘big name’ hedge funds. At least after fees. These are people who will give an ‘average’ P123 sub a better chance of beating the market than if they do things on their own. But only if the fees make sense relative to dollars invested. And only if people can find them. And build a well-constructed portfolio ‘blend’ of them. That’s very hard also for many weekend investors.

I can’t disagree with you waiting to see on any of my systems. But, if they do a 20% benchmark outperformance (or underperformance) year, it is my belief that they are no more likely to do well (or poorly) the following year. At least that’s my experience.

For ‘fun’, I made a little deck to reflect on this:
https://docs.google.com/presentation/d/1ijxx3csXW-6USn5NOGqHG08NgBim448St_8_fHhwWpA/pub?start=false&loop=false&delayms=3000

I still think the hardest issue for anyone, whether we built the system or not, is predicting its forward 1-2 year performance. I wish I could pick the ‘forward year’ winners from among my own models. I can’t.

Best,
Tom

Steve,

That’s usually true. But as long as the OOS outperforms the benchmark in real time, I would be happy with such a BPNN. In his thesis paper, Vanstone was able to get annualized returns of 30% from the Australian stock market with BPNNs. And this was on the test sets, not just the training sets. He was a beginner, using just Benjamin Graham rules, with an implementation that in my view tapped into only a fraction of the potential of NNs and AI, yet he did quite well on his first attempt!

I agree. But this can be dealt with. A moot point as P123 does not incorporate this form of AI.

I really didn’t want to get into a technical discussion of BPNNs; they were meant to serve as an example of AI tackling the issue of overfitting. The input layer takes in a row of the raw data, e.g., price, earnings, previous price(s), volume, etc. The weights to and from the middle layer of nodes are the AI equivalent of human rules, in an abstract way. BPNNs, and most types of NNs, can handle data non-linearly; P123 is strictly linear.
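As a toy illustration of that last point (a sketch only; the weights are made up and this is neither P123’s exact ranking math nor a real trained BPNN): a P123-style composite rank is, roughly speaking, a weighted linear combination of factor scores, while a network squashes the same inputs through a non-linear function at each hidden node, which is what lets it capture interactions a linear combination cannot.

```python
import math

inputs = [0.8, 0.2, 0.5]           # e.g. normalized factor scores for one stock
weights = [0.5, 0.3, 0.2]          # made-up factor weights

# P123-style composite rank: a plain weighted (linear) combination of the inputs.
linear_score = sum(w * x for w, x in zip(weights, inputs))

# One hidden node of a toy BPNN: the same inputs, but squashed non-linearly.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

hidden_weights = [1.5, -2.0, 0.7]  # made up; in a real BPNN, learned by backprop
hidden_output = sigmoid(sum(w * x for w, x in zip(hidden_weights, inputs)))

print(f"linear score: {linear_score:.3f}   hidden-node output: {hidden_output:.3f}")
```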

Hhhhmmmm. An answer to this gets too long and technical, and is outside the scope of the thread. Read Vanstone’s thesis.

This is patronizing; I hope that is not what you meant to do. In all cases the in-sample data should provide some useful information to help judge what future results should be; otherwise, why would P123 provide in-sample data as it now does?

If this were true I would be anxious to subscribe to the anonymous R2G port with a Sortino over 8, described above by me. As I said, I doubt I ever will.

Randy

Tom,

I feel sometimes I should just put the list of open R2G ports on the wall, have a monkey throw five darts at it, then pick the five it hits. Then sell the monkey.

Randy

Seems like I set off an avalanche when reviving the thread after it lay dormant for a year.

Good discussions here.

It’s important to remember the premises of working with P123.
We are restricted to a 16-year timeframe, heavily exposed to in-sample performance stats, and flooded with options to choose from to increase (in-sample) performance.
The platform delivers outstanding possibilities compared to the options of the average investor, and it holds its own as a disciplined approach compared to how professionals pick stocks.

However, the limitations and the design strongly encourage in-sample overfitting.
Try designing your system, use backtesting runs only to check for unintended errors, convert it to a live port, and check back after the next peak-to-peak market cycle.

Unfortunately we don’t have that kind of time and are generally eager to invest, so we prefer betting on any possible shred of outperformance that we feel confident with.

I believe that there are too many variables in the short run for reliable stock predictions to be possible.
There are too many macro trends, too much political turmoil, and too many social and environmental challenges that no one can accurately predict.

I think that if you turn off the noise, investing in the stock market is no different from trading goods a few thousand years ago.
It is a game of having more or better information, or of taking advantage of ‘animal spirits’ (Keynes).
Access to privileged information still seems to have a slight positive effect, but more striking might be to exercise patience and persistence in your core strategy.
Daniel Kahneman’s “Thinking, Fast and Slow” provides an excellent recap of the findings of psychology over the last decades.
I think a basic understanding of human nature helps us to make better choices, both as a personal investor (for example, thinking about ‘framing’ in terms of the data presented here) and in building models (know why strategies work and don’t run away from them if they underperform for a few years).

Of the R2Gs that have been launched for at least one year, only a little more than half have beaten the benchmark - keeping in mind that most benchmarks might not be appropriate and that this includes only performance in a generally friendly market environment. Risk-adjusted Sharpe or Sortino ratios over an out-of-sample peak-to-peak market cycle might come to a different conclusion.
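For anyone who wants to run that comparison on their own return series, here is a minimal sketch of both ratios; the annualization factor and the zero target/risk-free rate are simplifying assumptions of mine, not P123’s exact formulas.

```python
import math

def sharpe(returns, rf_per_period=0.0, periods_per_year=52):
    """Annualized Sharpe ratio from periodic (e.g. weekly) returns."""
    excess = [r - rf_per_period for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def sortino(returns, target_per_period=0.0, periods_per_year=52):
    """Annualized Sortino ratio: like Sharpe, but only downside deviations count."""
    excess = [r - target_per_period for r in returns]
    mean = sum(excess) / len(excess)
    downside = sum(min(0.0, e) ** 2 for e in excess) / len(excess)
    return mean / math.sqrt(downside) * math.sqrt(periods_per_year)

weekly = [0.004, -0.002, 0.011, 0.003, -0.006, 0.009, 0.001, -0.001]  # toy data
print(f"Sharpe  ~ {sharpe(weekly):.2f}")
print(f"Sortino ~ {sortino(weekly):.2f}")
```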

That said, and without finger pointing and by including myself, I think that most of what we observe (in terms of evaluating R2G performance) is sheer chance.

Randy, I might want to borrow your monkey for now after he has thrown your darts.

Best,
fips

rallan - I “played” with Neural Nets for several years back in the early 1990s, specifically for the financial markets. By the end of the '90s the entire financial industry had pretty much given up on them simply because there is no visibility into the underlying (abstract) “rules”. I’m sure there are people such as Vanstone who try to breathe life back into this area but I believe that it is pretty much dead for the reason I mentioned above.

I have not read Vanstone’s thesis and I don’t plan to, the reason is personal (has to do with academics in general). However, demonstrating on “test sets” is not what I consider to be evidence that something is a valid approach. Test sets are generally small and are subject to the same biases as any optimization technique.

When I originally became attracted to Neural Nets, it was due to the promise of “artificial intelligence”: that I could feed the net a bunch of technical analysis indicators and the net would discover relationships, patterns, etc., and make intelligent predictions for the future, all with a keystroke. But this of course is a delusion.

It was you who brought up neural nets. Anyway, quantitative analysis, as far as choosing inputs by testing is concerned, tends to follow the same sort of decay with time. The exceptions, of course, are those factors that are based on mathematics directly tied to the company’s books; presumably they should work forever.

As I stated earlier, P123 ranking factors are not the equivalent of NN “inner-layer nodes”. Ranking factors are not modified by feedback systems. They are the equivalent of NN “inputs”. So long as the ranking factors are “valid” and have some predictive value, there is no restriction on the quantity of factors used. In fact, the more unique inputs the better.

P123 buy/sell rules are also not the equivalent of NN inner layer nodes. If they were then the number of buy/sell rules would bear some relationship to the number of ranking factors. 2:1 inputs:nodes tends to be optimal for financial NNs. I am however a minimalist when it comes to buy/sell rules. Zero is optimum.

Marc G. can correct me if I am mistaken, but I believe he said that there is a short period of time in 2013 (6 months?) that is relevant for backtesting, i.e., the period of rising interest rates. The rest of the 14 years of data, although interesting, is not particularly useful.

Others design models for optimal results over 14 years of data, the previous 5 years of data, through two bear markets, whatever. This does not mean that such optimization is wrong, nor should the entire 14 years of backtest data be considered representative of how the model’s future performance should look. A backtest is a tool, nothing more.

Why P123 provides in-sample data as it now does has been deliberated since the start of R2G, and I have been pretty vocal about the issue from the beginning. In-sample is the best case; there isn’t a soul out there who isn’t over-optimizing, except for Marc G. (and possibly SZ). As Marc works for P123, he is providing free models without the pressure that the rest of us face to “outdo” each other with in-sample data.

Steve

It doesn’t provide a lot of useful information on what future results should be. As it is currently in R2G, it’s a marketing tool, which is fine.

Backtests are useful for a designer, who gets the whole picture by running dozens of them during the development process.

Judging a strategy based on a single backtest is like judging a girl (or boy because this is 2014) based on a single picture. It’s risky. That one picture makes for a very partial representation.
Yet can we blame designers for showing the world their best backtest?

Anyhow, I think the consensus here among R2G users is that backtests should not be presented at all. I think strategies would market themselves just fine based on post-launch performance only.
P123, R2G users and designers have everything to gain from this. It would make us look more professional, for one thing, and avoid the influx of clients towards strategies that have yet to prove their worth out-of-sample.
I wish it would eventually happen.

With all backtests/sims, it’s important to understand what was going on in the world/market at different times and recognize which sub-sets of the test match best/worst with your expectations going forward. Although interest rates remain very low, I think at this time we do need to think about how things might look for our strategies in a rising interest-rate environment. Unfortunately, life has been stingy in terms of giving us good sample periods. Mid 2013 may be the best we have. So it’s an important one to consider, especially for income models (for these mid 2013 might be the only useful test period).

But we need to be aware of other issues, too.

For example, I want no part of 1999-2002 testing because that period, I believe, was unique given the nature of the dot-com bubble and crash. In fact, inclusion of it in tests poses great danger of trapping us into kidding ourselves. One of the oddities of that period was the way a narrow group of stocks collapsed spectacularly (and pulled market-cap-weighted indexes down with them) while many of the huge number of stocks outside that group rose or fell just modestly. So many non-dot-com p123 strategies built up a lot of alpha during that period. But more typically, and as we’ve seen with later downturns, the bear is more inclusive (in fact, if anything, correlations seem to be higher now than in the past). So pre-2003 test results, while making for a better feel-good experience than fine wine, serve mainly to distort the merits of the strategy.

I also “forgive” models for just about anything that happened in the 2008 crash. While we can easily apply 20-20 hindsight to build market timing rules that allow us to fantasize about our ability to avoid large “drawdowns,” the reality is that absent hindsight, there’s little we can do to protect ourselves from epic financial meltdowns where the only truly useful fundamental test would be one that could answer the question: “Who owns the stock and how desperate are they to raise cash?” Ironically, to the extent that stocks weren’t all pretty much in lockstep in 2008, what variation we did see (very little) caused better stocks to underperform, because those were the ones for which spinning-out-of-control funds could get legitimate bids.

I know I’m in an awkward position when I talk against over-optimizing; as Steve says, I’m with p123, so the only models I put up have been freebies to help launch the site and, one could say, don’t need to attract subscribers. But my situation aside, Father Time is the ultimate judge/evaluator. And if you can’t generate out-of-sample performance, Father Time is going to turn thumbs down. R2G had a charmed life early on, when there was little out-of-sample data to look at. Those days are gone and are no more likely to return than dial-up AOL is to regain its stature as king of the internet.

1.) Marc, what interest rates are you referring to? Perhaps the period from mid-2012 to the end of 2013, when long bond (20-yr) yields went from 2.1% to 3.6%, should be considered a period of rising interest rates. So perhaps one should look at R2G performance over this time.

2.) Assuming that all market timing rules are nonsense, then what have we got left? The only stocks that will be less punished in a down-market are minimum-volatility stocks and dividend-paying stocks. I have been running out-of-sample models since July 2014 using as universe the holdings of the minimum-volatility ETF USMV. Results so far are very good. You can read about the method here: http://www.advisorperspectives.com/dshort/guest/Georg-Vrba-140627-Minimum-Volatility-Stocks.php and a follow-up report on the Trader model here: Minimum Volatility Stocks: iM’s Best12(USMV)-Trader | iMarketSignals
Here is the out-of-sample, up-to-date (12-8-2014) chart of the Trader model: http://imarketsignals.com/wp-content/uploads/2014/12/Fig-7.1.USMVtrade-12-9-2014.png. To the best of my knowledge there is not a single R2G model that matches these returns.

So my philosophy is simple. Why would I be able to select a better minimum-volatility stock universe than the professionals at iShares? I simply replace the universe of my Best12(USMV)-Trader model every 3 months to keep it current with the stock holdings of USMV.
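In rough pseudo-workflow form, the quarterly routine looks like the sketch below; get_usmv_holdings and rank_by_my_system are hypothetical stand-ins for the manual steps (downloading the ETF’s holdings and running them through the P123 ranking system), not real APIs.

```python
from datetime import date

# Hypothetical stand-ins for the manual steps; not real APIs.
def get_usmv_holdings(as_of: date) -> list:
    """Download the current USMV constituent list (placeholder data here)."""
    return ["AAPL", "JNJ", "PEP"]            # ...plus the rest of USMV's holdings

def rank_by_my_system(tickers: list, as_of: date) -> list:
    """Run the tickers through the P123 ranking system (stand-in: alphabetical)."""
    return sorted(tickers)

def quarterly_rebalance(as_of: date, n_stocks: int = 12) -> list:
    """Every 3 months: replace the universe with USMV's current holdings,
    then hold the top-ranked n_stocks from that refreshed universe."""
    universe = get_usmv_holdings(as_of)
    ranked = rank_by_my_system(universe, as_of)
    return ranked[:n_stocks]

print(quarterly_rebalance(date(2014, 10, 1)))
```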

However, these models cannot be offered as R2Gs because they do not comply with the current rules. So perhaps there is something wrong with rules that require long backtests but no out-of-sample performance data. I will report back in 6 months, when the out-of-sample period for the Trader is 1 year long.

Georg

Georg - I like your idea but couldn’t pass up the opportunity for self-promotion. There is one system (at least) with a better record.
Steve


Nice stats for both models: Georg’s looks smoother, Steve’s has the higher absolute return.

However, there are a few models that match that return.
But I think it’s careless to compare based on a six-month period and by absolute return only.
We would need the OOS Sharpes, Sortinos and longer OOS timeframes.

The low volatility anomaly has been around for some time. It’s one of many helpful anomalies.

Georg, why shouldn’t you be able to construct a better minimum volatility stock universe than iShares? They have more resources and suck up more Ivy League students than you probably have at your disposal, but they are mostly prone to the same biases and maybe face different limitations.
One of the reasons I work with P123 is that I want to know what’s inside the box. It would be nice nonetheless to know what they are screening for.

As Marc says, Father Time will teach us.

Best,
fips

Steve, your Micro-Cap USA looks great. But very few people can actually trade this model and benefit from it because of liquidity constraints.

Fips, I agree that the OOS period for the USMV-Trader is still too short. I am doing an experiment and am not claiming that the Trader will always outperform the S&P 500. But remember that it only trades large-caps and has no liquidity problems, so perhaps one must compare its performance with other large-cap models. Also, the USMV ETF follows the MSCI USA Minimum Volatility Index, whose construction is not described in enough detail for it to be replicated. You can buy the screening parameters from MSCI if you have the money.

Best,
Georg

Georg,
You might check on how R2G prices are calculated. I think, unlike private ports, they are adjusted by a formula for Monday’s prices.

Marco posted on March 31, 2014:

“The price for the transaction will be today’s (Hi + Lo + 2*Close)/4 +or- slippage. The slippage is calculated as a per share amount using the variable slippage algorithm, and it’s either added to the price for buys&covers, or subtracted from the price for sells&shorts.”
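In code form, the quoted fill-price rule is simple; this is just a restatement of Marco’s formula, with the per-share slippage treated as a given input (the variable-slippage algorithm itself isn’t reproduced here) and made-up example numbers.

```python
def r2g_fill_price(high, low, close, slippage_per_share, side):
    """Fill price per the quoted formula: (Hi + Lo + 2*Close)/4, plus slippage
    for buys/covers and minus slippage for sells/shorts."""
    base = (high + low + 2 * close) / 4
    if side in ("buy", "cover"):
        return base + slippage_per_share
    if side in ("sell", "short"):
        return base - slippage_per_share
    raise ValueError("side must be buy, cover, sell or short")

# Made-up example: Monday's H/L/C of 20.40/19.80/20.10 and 3 cents of slippage.
print(round(r2g_fill_price(20.40, 19.80, 20.10, 0.03, "buy"), 4))   # 20.13
```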

Geov,

Your idea of creating systems based on Smart Universes that are pulled from well-designed ETFs is very worthwhile. There are many ETFs that do things P123 simply can’t. Complex factor-based minimum-variance optimization, with a host of underlying constraints based on data from hundreds of sources, is clearly not something P123 can (or likely ever will be able to) replicate in terms of universe creation.

I never bothered with these on my own, because they are too much work to execute and test, and there is so much P123 can do, so I’ve stuck with rules-based universes, of which there are an infinite number.

But, I’d love it if you could share the ‘backtest list universe’ for P123, so I can build some systems for personal use on it.

Fips, Geov may be able to create a ‘minimum variance’ universe using beta, financial quality and defensive sectors, but it will be a very different universe (and process) from the one that MSCI / Barra are using in this optimization process. They have very different data sets (hundreds of providers)… and a very different optimization process. Sampling 20-30 stocks based on a fundamental ranking from the minimum-variance universe they have created (with decades of risk management and optimization under their belts and large nations as clients) is a very valid approach for all P123 traders. And it’s very smart. There are many ETFs that use underlying universe-creation methodologies we can’t look at (for example 13F replication of hedge funds with the longest holding periods and a ‘clonability’ index).

As to whether or not these should be R2G’s, I am not wading into that debate. But, I’d love the historical backtest ‘uni’ P123 exposure list to play with.

Best,
Tom

Geov,

The specific index you are tracking is minimizing volatility at the total universe level (while typically maintaining the underlying index’s sector, industry, market-cap and country weights, along with a host of other replication constraints). What you are doing with just 12 stocks is VERY DIFFERENT, and is very unlikely to turn out a minimum-variance product. You are taking concentrated company, sector and country bets (very concentrated). You would likely need a very long backtest history to draw any inferences as to whether / how it works. Even 3-5 years of out-of-sample results likely won’t tell you very much.
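To see why minimizing volatility at the universe level and holding a 12-stock slice of that universe are different things, here is a stripped-down, unconstrained minimum-variance calculation (weights proportional to the inverse covariance matrix times a vector of ones) on a made-up covariance matrix; the real MSCI/Barra process adds sector, country and turnover constraints and a proprietary risk model, none of which is reflected in this sketch.

```python
import numpy as np

# Made-up annualized covariance matrix for 4 assets.
cov = np.array([
    [0.040, 0.010, 0.008, 0.004],
    [0.010, 0.090, 0.012, 0.006],
    [0.008, 0.012, 0.060, 0.005],
    [0.004, 0.006, 0.005, 0.020],
])

# Unconstrained minimum-variance weights: w = cov^-1 * 1, rescaled to sum to 1.
ones = np.ones(len(cov))
w_minvar = np.linalg.solve(cov, ones)
w_minvar /= w_minvar.sum()

# A concentrated alternative: all the weight in just two "best-ranked" names.
w_concentrated = np.array([0.5, 0.0, 0.0, 0.5])

def port_vol(w):
    return float(np.sqrt(w @ cov @ w))

print("min-variance weights:", np.round(w_minvar, 3))
print(f"min-variance vol: {port_vol(w_minvar):.3f}   "
      f"concentrated vol: {port_vol(w_concentrated):.3f}")
# Spreading weight across the whole universe is what delivers the low-volatility
# property; picking a handful of names from that universe does not carry it over.
```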

But… I’d still love to play with the data.

Best,
Tom

Tom,
What you say about the MSCI methodology is absolutely correct. Also, they re-optimize the index twice a year, so the selection parameters are not static.
Country is only USA and I don’t agree that the 12 stocks are concentrated bets. The buy rules prevent that from happening:

Sector Weight <30%,
and Industry Weight <20%,
and exclude some of the largest market cap stocks from being selected.

Also, if you read my original article, I intend to run 4 quarterly-displaced models which hold their initial positions for 1 year minimum. Two of them are already up and the third will be posted on Jan-2. Thus there will eventually be about 30 different stocks in the holdings of the four models. Using four displaced models provides the ability to stage one’s investments over a year, with trades occurring approximately every 3 months thereafter. The universe gets replaced every 3 months as well, to stay current with USMV. So by July 2015 I will have about 20% of all the USMV holdings in my models. Even if my method only produces a few percentage points better returns than USMV over time, it will be worth the effort.
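Just to make the staging concrete, here is a rough sketch of the calendar this implies; the first launch date is taken from the July 2014 start mentioned earlier, and the quarterly spacing of the remaining tranches is an assumption for illustration, not a published schedule.

```python
from datetime import date

def add_months(d, months):
    """Simple month arithmetic (safe here because the day of month is 1)."""
    m = d.month - 1 + months
    return date(d.year + m // 12, m % 12 + 1, d.day)

# First launch approximated from the July 2014 start; quarterly spacing assumed.
first_launch = date(2014, 7, 1)
for i in range(4):
    start = add_months(first_launch, 3 * i)
    first_turnover = add_months(start, 12)      # initial positions held ~1 year
    print(f"Tranche {i + 1}: launch {start}, 12 stocks, "
          f"first scheduled turnover around {first_turnover}")
# With four 12-stock tranches (and some overlap) roughly 30 distinct names end
# up held at once, and the universe itself is refreshed from USMV each quarter.
```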

I will email the universe to you so you can play with it.
Best,
Georg

Tom, I have just finished another model using the holdings of the $23-billion Vanguard Dividend Growth Fund VDIGX. There are only 50 stocks in the holdings and I use 10 of them in a Trader model, but this is not OOS, just a backtest.
http://imarketsignals.com/2014/trading-the-dividend-growth-stocks-of-the-vanguard-dividend-growth-fund-simulated-performance-of-ims-best10vdigx/

I think there are many candidate ETFs and mutual funds where one can use this method.
Best,
Georg