R2G DESIGNERS - READ THIS!!!!!!!!!!!!!

Adviser Perspectives enabled me to reply in their AP Viewpoint area, which I did:

http://www.apviewpoint.com/component/kunena/?func=view&catid=94&id=8888&Itemid=0#8888

I’ll also move traffic there from hvst.com, which has a serious professional audience, and will probably do a summary version on Forbes.

re: High Liquidity Models
This model, (SPY-Cash) with Hi-Lo Index, switches between SPY and cash.
Liquidity (bottom 20%) according to the R2G conversion routine is $1,529,902,720.

The timing indicator is derived from a custom series: the daily number of S&P 500 stocks that have reached new 3-month highs minus the number that have reached new 3-month lows, divided by 500. There is only one variable in the model. It is therefore highly unlikely to fail in the future.
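For readers who want to see the arithmetic, here is a rough Python sketch of that indicator. It is not Georg’s code; the DataFrame layout, the 63-trading-day window, and the switch rule in the closing comment are all assumptions.

```python
import pandas as pd

def hi_lo_index(closes: pd.DataFrame, lookback: int = 63) -> pd.Series:
    """Daily (new 3-month highs - new 3-month lows) / 500 for a DataFrame of
    daily closes with one column per S&P 500 member (~63 trading days = 3 months).
    """
    rolling_high = closes.rolling(lookback).max()
    rolling_low = closes.rolling(lookback).min()
    new_highs = (closes >= rolling_high).sum(axis=1)  # stocks at a new 3-month high
    new_lows = (closes <= rolling_low).sum(axis=1)    # stocks at a new 3-month low
    return (new_highs - new_lows) / 500.0

# A switch rule of the kind described might then hold SPY while the (smoothed)
# index stays above some threshold and move to cash otherwise.
```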

Using the P123 formula, the maximum port size is about $3 billion. That should be enough!
BTW this model is currently in cash.



Marc,

The link takes me to a sign-up page for advisers.

Thanks.

Yes, so I see. I checked with them about exclusivity and they say they are fine with me reproducing elsewhere with acknowledgement, but as a matter of courtesy, I’m going to hold off for a few days to give the author an opportunity to reply. When appropriate, I’ll provide a link to a site you’ll be able to access.

Why do you believe that using only one variable would lead to a higher likelihood of success?

I didn’t follow this part of the thread so I have no idea what the issue is. But when I see that phrase, I instinctively go “Oh sh**!”

Forget finance. Forget statistics. Forget economics. Remember superstition!!! That’s a bigger whammy than the Sports Illustrated cover jinx. :slight_smile:

It is common knowledge that the more variables a model has, the less robust it becomes.

Georg,

That is not necessarily true. My 4 best performing ranking systems were created in 2009 by modifying existing older systems. I combined the best 9 systems at the time into 4 new systems and added or subtracted some factors to take advantage of information learned from the last recession. These 4 systems contain from 56 to 112 factors. With 6 years of out-of-sample performance, these 4 systems have proven to be more robust than any other public system I can find.

Denny, I am glad that your models worked well for you. But you have only used them during up-market conditions from 2009 onward.

Robustness is the ability of a financial trading system to remain effective under different market conditions. Using an excessive number of parameters may induce overfitting. A model that has been overfitted will generally have poor predictive (out-of-sample) performance, as it can exaggerate minor fluctuations in the data. So I still think that models with fewer parameters that can be varied are more robust than those with many parameters.
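To see the overfitting effect Georg describes in isolation, here is a toy example on synthetic data (nothing to do with P123 or any actual model): as the number of fitted parameters grows, the in-sample error keeps shrinking while the out-of-sample error typically gets worse.

```python
import numpy as np

rng = np.random.default_rng(0)

def truth(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 20)    # small "backtest" sample
x_test = np.linspace(0, 1, 200)    # unseen "out-of-sample" data
y_train = truth(x_train) + rng.normal(0, 0.3, x_train.size)
y_test = truth(x_test) + rng.normal(0, 0.3, x_test.size)

for degree in (1, 3, 12):          # the parameter count grows with the degree
    coeffs = np.polyfit(x_train, y_train, degree)
    mse_in = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    mse_out = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: in-sample MSE {mse_in:.3f}, out-of-sample MSE {mse_out:.3f}")
```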

Gosh Georg,

6 years of out-of-sample performance since 2009 (performance similar to the Sim’s) is not enough to show robustness? Is it really necessary to experience a recession to verify robustness? How long do you suggest we watch your R2G Ports for robustness before we subscribe? :smiley:

I do understand that there are academic studies that “show” fewer factors are more robust, but I contend that carefully constructed systems with many independent factors have a good probability of being more robust than much simpler systems, especially if they are designed using EvenID = 1 stocks and then tested against totally out-of-sample EvenID = 0 stocks.
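A minimal sketch of the kind of odd/even holdout Denny describes, assuming the stock IDs are numeric and that EvenID = 1 corresponds to even IDs (the column name is hypothetical):

```python
import pandas as pd

def split_by_id_parity(universe: pd.DataFrame, id_col: str = "stock_id"):
    """Split a stock universe into a design half and a holdout half by ID parity,
    mimicking the EvenID = 1 / EvenID = 0 split: tune the system on one half,
    then evaluate it once on stocks it has never seen.
    """
    design = universe[universe[id_col] % 2 == 0]   # assumed: EvenID = 1 means even IDs
    holdout = universe[universe[id_col] % 2 == 1]  # EvenID = 0: the out-of-sample half
    return design, holdout

# design, holdout = split_by_id_parity(my_universe)
# ...build and optimize the ranking system on `design`, then test it once on `holdout`.
```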

That quote is correct; I have been watching it for over a year. However, it did lose 7.9% from Oct. 17, 2014 to Dec. 5, 2014 while your baseline was up 10.3%. Is that still robust?

I am not sure if this is common knowledge or, for that matter, a true statement. It depends on how the ranking system was constructed in the first place, among many other considerations.

My first impression was disappointment that the thread seemed to be drifting off topic, but on reflection, this robustness question remains very relevant. My short answer is that I don’t have enough of a statistics background to comment on whether it’s true that reducing the number of factors increases robustness, but if it is true, who cares? Robustness looks to me like it may be the right answer to the wrong question.

I can’t believe I’m going to do this, but I’m actually going to cite the Adviser Perspectives article I rebutted; specifically, a part with which I actually agreed and for which I gave the author props:

“All of this rests on the assumption that relationships between factors and rates of return – if they exist – are persistent. That is, that they persist from one time period to another. This assumption is warranted in almost all scientific fields. If a result of a combination of physical forces is convincingly discovered in experiments, it will work in the future too. If a vaccine is developed that prevents a disease for test subjects, it will work even if everybody uses it.”

He goes a bit astray in the next paragraph:

“These assumptions are completely unwarranted in the investment field. If an investment strategy that beats the market is discovered and verified – even if it is not a spurious discovery – it will not work for everybody. It cannot; it is tautologically obvious that not everybody can beat the market. Hence, a strategy that is identified as effective, correctly or not, must, at some point, if it becomes popular and widely adopted, stop working and may even reverse to become a bad strategy.”

Yeah, a factor being arbitraged out is something about which we must worry and is why we always need to stay on top of what we do (and a reason why I test with purposive sampling, rather than simply hitting the “Max” link). Frankly, though, anybody who loses sleep over the prospect of “everybody” using a strategy needs to get out of the house more often.

What the author missed is that in scientific research, the findings are expected to be applied to the same population as that from which the test samples were drawn. So a vaccine successfully tested on research subjects can be expected to work for the population (subject to discovered and disclosed exceptions).

That’s not us. We can ONLY apply our findings OUTSIDE the population from which we test. The findings we make using the 1/2/99-8/27/15 population and/or any samples drawn therefrom cannot be applied to any part of that population. They can only be applied to the indeterminate population of 8/28/15 – Whenever. So who cares how robust the test results are? If the 8/28/15-plus population differs in relevant respects, the model remains useless no matter how many mathematicians and statisticians would praise it.

To jump out of and beyond the research population and into the application population, we need reasons why we think that will be doable. We can never be sure (until it’s too late). But we have a big body of theoretical knowledge upon which we can rely to enhance our probabilities, and when we deal, as we must, with the 8/28/15-plus population, probabilities are all we can get.

In our context, minimizing the number of factors can be disastrous.

The single biggest problem is that it leaves us exposed to model mis-specification. This is a huge issue and may be the single biggest reason why a model might fail even if the future population resembles a historic population against which a robust test was done.

There’s also the nature of what we’re trying to do. Stocks do what they do for a multitude of reasons, and there are no brownie points awarded to those who resist the temptation to look at (uncorrelated but similarly potent) factors 2-5 and stick with robust factor 1 out of statistical principle. Sooner or later, you’ll hurt yourself if you reduce the number of ways you can succeed, and eventually that will make you (and your R2G subscribers) very unhappy.

Marc, your comments are good, but we all know that a backtest can be optimized by adding more and more parameters until the desired high return is achieved. In this respect the Akaike Information Criterion (AIC) provides a measure of a model’s robustness: it is a criterion that seeks a model with a good fit to the truth but few parameters.
http://brianomeara.info/tutorials/aic/
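For a least-squares fit with Gaussian errors, the AIC Georg mentions reduces (up to an additive constant) to n·ln(RSS/n) + 2k, where k is the number of fitted parameters. A rough sketch, not tied to any particular P123 model:

```python
import numpy as np

def aic_least_squares(residuals: np.ndarray, n_params: int) -> float:
    """AIC for a least-squares model with Gaussian errors, ignoring additive
    constants: n * ln(RSS / n) + 2k. Lower is better; the 2k term penalizes
    every extra parameter, so a better fit only "pays" if it beats that penalty.
    """
    n = residuals.size
    rss = float(np.sum(residuals ** 2))
    return n * np.log(rss / n) + 2 * n_params

# Two candidate models fit to the same data can then be compared directly:
# the one with the lower AIC achieves its fit with less parameter-driven flexibility.
```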

If someone is going to do the wrong thing, they’re going to do the wrong thing, and eventually, Mr. Market and the court of public/client opinion will render verdict and sentence.

I’m not a math expert and struggle to understand what was on that page. But for what it’s worth, I didn’t see anything that addressed the type of data challenges with which we deal. It seemed, for instance, to reflect an unstated presumption that you can count on the data meaning what you think it means. What if you can’t (an everyday problem for us)? How do they advise guarding against the risk of model mis-specification to which we’re so prone?

We need to be very careful about applying protocol from outside of finance. We can never presume appropriateness. It has to be considered and established each time.

Irony of ironies, the author of the Adviser Perspectives article replied and we reached agreement pretty quickly. He backed off from what had been an overly broad assumption re: the representativeness of the practices he described, and we ended up agreeing on the big questions. Of particular interest now is the following from his reply (bearing in mind this doesn’t come from a math dunce like me but from a mathematician and economist who is a visiting fellow with the Centre for Systems Informatics Engineering at City University of Hong Kong, a principal and chief strategist of Compendium Finance, and a research associate at EDHEC-Risk Institute; his co-author is Head of the Systems Engineering and Engineering Management department and Chair Professor of Industrial Engineering at City University of Hong Kong):

“In short, the only way you can tell whether a historical relationship among investment factors is likely to continue in the future – no matter how robust or how much the relationship has been replicated ‘out of sample’ – is to have a solid theory.”

This is the answer. This is the way to protect against bad models, not artificially constraining the number of factors and exposing your work to a host of other problems. If you want to refrain from drunk driving, then don’t drink and drive. It’s not necessary to give up your car and surrender your driver’s license.

Totally agree with this.

What exactly do you mean by “parameter” in the context of a P123 model? I would accept that adding buy rules such as “Do not buy Enron if year equals 2001” would probably make a model less robust in the future. But adding a sensible-sounding factor to the ranking system wouldn’t be a problem.

In the context of P123, I suppose “parameter” can be used to refer to a buy rule, a sell rule and/or a rank factor. I’m guessing most of those who speak on forums are thinking in terms of rank factors. To me, though, buy rules are extremely important, maybe even more important. I would not describe a buy rule like “Do not buy Enron if year equals 2001” as making a model less robust. You’re so much nicer than I am. I’d describe that model using the F-word: fraud.

I have no idea if use of buy rules, however many one thinks are needed, makes a model less “robust,” but if it does, then robustness would need to be abandoned as a goal. Conditioning the universe is likely to be far more potent than anything that can be done with a sort, unless the ranking system is very broad. The classic example of the interaction between screening and ranking is Piotroski. The F-Score is not supposed to be run against a universe as a basis for sorting. The idea was to find a way to avoid the presumption-of-dog that initially attaches to stocks with high BM (same as low PB) ratios.
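A sketch of that screen-then-rank interaction, with hypothetical column names and thresholds; the point is only that the universe is conditioned on high book-to-market first, and the F-Score is applied within that group rather than used to sort the whole market:

```python
import pandas as pd

def piotroski_style_screen(universe: pd.DataFrame,
                           bm_col: str = "book_to_market",
                           f_col: str = "f_score",
                           bm_quantile: float = 0.8,
                           min_f: int = 7) -> pd.DataFrame:
    """Condition the universe on high book-to-market (i.e. low P/B) first, then
    keep only names whose F-Score clears a threshold within that group, instead
    of sorting the whole market by F-Score.
    """
    bm_cutoff = universe[bm_col].quantile(bm_quantile)
    high_bm = universe[universe[bm_col] >= bm_cutoff]
    return high_bm[high_bm[f_col] >= min_f]
```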

I agree!

Value → I’ve got a theory + empirically tested by a lot of scientists
Size → I’ve got a theory + empirically tested
Momentum → I’ve read theories about it, not convincing though; I don’t understand the theory, so I go back to empirical tests
Earnings Momentum (market level for timing and stock level for ranking) → I’ve got a theory

Combined in one model → also empirically tested, and even better results.
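A toy illustration of combining those factor families into one composite rank, with equal weights and hypothetical column names (nothing resembling actual P123 syntax):

```python
import pandas as pd

def composite_rank(df: pd.DataFrame) -> pd.Series:
    """Equal-weighted composite of percentile ranks for the four factor families
    listed above. Column names and directions are assumptions: higher earnings
    yield, smaller size, stronger price momentum and stronger earnings momentum
    all rank better here.
    """
    ranks = pd.DataFrame({
        "value": df["earnings_yield"].rank(pct=True),
        "size": (-df["market_cap"]).rank(pct=True),
        "momentum": df["return_12m"].rank(pct=True),
        "earnings_momentum": df["eps_revision"].rank(pct=True),
    })
    return ranks.mean(axis=1)
```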

Regards

Andreas

So 3:1

As I understand it, Georg’s motivation in seeking a “robust” system is to avoid curve fitting. I suppose the thinking is that adding more parameters gives the designer more wiggle room to fit the curve, which must make it a bad thing.

Before adding a ranking system factor, I prefer to ask myself if I could plausibly have thought of it in advance. As far as the number of factors is concerned, I would have thought that a small number of factors leads to a high risk that none of the factors I choose will continue to outperform. Too many factors would probably dilute performance. I’m not too sure what the ideal number might be.

Not sure if I’m interpreting correctly. If you’re refraining from use of momentum because you aren’t comfortable with the theoretical material, that’s fine. If you’re going empirical-only in the absence of theoretical understanding, you may get lucky, or your luck could run out.

If you want to read more on momentum, Google “Clifford Asness.”