How many R2Gs have INCORRECT PERFORMANCE stats?

Out-of-sample performance can only start from the date a series is corrected; before the fix date, OOS performance is not valid.

When “fixes” are made to historic data series, performance usually deteriorates, because the model’s algorithm was optimized for the data as it existed when the model was designed. For example, my Best(SPY-SH) model’s trading rules use volatility, risk premium, and earnings estimates together with moving-average crossovers. Earnings estimates for the S&P 500 were recently “fixed,” which reduced the model’s CAGR and may also influence future performance. Here are the risk measurements for the period 1/2/2000 to 8/30/2013 before and after the “fix.” The model’s documentation, with performance history to 8/30/2013, can be found here: http://imarketsignals.com/wp-content/uploads/2013/09/spy-sh-r1-tbl3.png
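To illustrate why a data “fix” can alter historic signals, here is a toy moving-average crossover in which revising a single data point flips the most recent signal. The series and window lengths are invented for illustration only; they are not Best(SPY-SH)’s actual rules or data.

```python
def sma(xs, n):
    """Simple moving average with window n; first value at index n-1."""
    return [sum(xs[i - n + 1:i + 1]) / n for i in range(n - 1, len(xs))]

def crossover_signal(series, fast=2, slow=3):
    """1 = long while the fast SMA is above the slow SMA, else 0 (hedged)."""
    f, s = sma(series, fast), sma(series, slow)
    f = f[len(f) - len(s):]          # align the two SMA series
    return [1 if a > b else 0 for a, b in zip(f, s)]

original = [100, 101, 103, 102, 104, 105]
revised  = [100, 101, 103, 102, 104, 99]   # last point "fixed" downward

print(crossover_signal(original))   # stays long on the final bar
print(crossover_signal(revised))    # the final signal flips to flat
```

Because every historic signal is recomputed from the revised series, a single changed input can rewrite the entire simulated equity curve from that point on.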

The benchmark SPY return and max drawdown are the same. The Standard Deviation for SPY should also be the same, but it is not. P123 must have recently changed the method by which Standard Deviation is calculated, but I don’t recall an announcement to this effect.
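P123’s exact formula isn’t documented here, but as a hedged illustration, a reported standard deviation depends on choices such as the sample vs. population estimator and the assumed sampling frequency. Switching either one changes the number even though the underlying return series is identical:

```python
import math

def annualized_std(returns, periods_per_year=52, sample=True):
    """Annualized standard deviation of periodic returns.

    sample=True uses the (n-1) sample estimator; sample=False uses
    the population (n) estimator. The result is scaled by
    sqrt(periods_per_year), so the assumed frequency matters too.
    """
    n = len(returns)
    mean = sum(returns) / n
    ddof = 1 if sample else 0
    var = sum((r - mean) ** 2 for r in returns) / (n - ddof)
    return math.sqrt(var) * math.sqrt(periods_per_year)

# Made-up weekly returns, for illustration only.
weekly = [0.011, -0.004, 0.007, 0.002, -0.009, 0.005, 0.013, -0.002]

print(annualized_std(weekly, sample=True))    # sample estimator
print(annualized_std(weekly, sample=False))   # population estimator (smaller)
```

A silent switch between two such conventions would produce exactly the kind of unexplained change in the SPY benchmark’s Standard Deviation described above.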


[Image: Best(SPY-SH) before fix.png]


[Image: Best(SPY-SH) after fix.png]

Maybe we need more color-coded flags on the equity curve, like the current “designer revision” flag, but for “database revision” and “factor revision,” at least for major revisions that heavily impact the equity curves.

I think it’s time for a redesign.

The current implementation is too restrictive because an R2G is one system: a simulation that is converted to a live port. This makes rebuilding the simulated part impossible.

We’ll change R2Gs to be two systems: a live port and a simulation. This way we can show separate stats, easily re-run just the simulated portion, present the data differently, etc.

We’ll have more details soon and get started on this now.


Thank you very much Marco!

Marco - if you are going to do this, then please don’t allow simulation stats to be integrated with OOS as they are now, and don’t allow backtest data to be sorted as one would sort performance data. Otherwise you will have created a simulation paradise where you just put up a new, improved simulation whenever the system performs poorly. We would have “simulation showboating” in perpetuity. Not what you want.

I see nothing wrong with the way mutual funds and ETFs do it, where they create a prospectus with a theoretical index but the data is not integrated with OOS performance. The OOS has to be standalone.

Steve

While we are on the subject of restructuring the rules for R2G models, I would like to draw attention to the possibility that having a constant selection process for a universe may not be desirable. I was reading how MSCI constructs their USA Minimum Volatility Index: http://www.msci.com/resources/factsheets/MSCI_USA_Min_Vol_Factsheet.pdf

The MSCI Minimum Volatility Indices are designed to provide the lowest return variance for a given covariance matrix of stock returns. Each Minimum Volatility Index is calculated using the Barra Optimizer to optimize a given MSCI parent index for the lowest absolute volatility under a certain set of constraints. These constraints help maintain index replicability and investability and include, for example, index turnover limits along with minimum and maximum constituent, sector, and/or country weights relative to the parent index. Each Minimum Volatility Index is rebalanced (re-optimized) semi-annually, in May and November.
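For illustration only: MSCI’s actual process uses the Barra Optimizer with turnover and weight constraints, but the unconstrained, fully-invested minimum-variance portfolio for a covariance matrix Σ has the textbook closed form w = Σ⁻¹1 / (1ᵀΣ⁻¹1). The covariance numbers below are made up:

```python
import numpy as np

# Toy annualized covariance matrix for three stocks (invented numbers).
cov = np.array([
    [0.040, 0.010, 0.006],
    [0.010, 0.090, 0.012],
    [0.006, 0.012, 0.025],
])

ones = np.ones(len(cov))
inv = np.linalg.inv(cov)

# Minimum-variance weights: w = (Σ^-1 · 1) / (1' · Σ^-1 · 1); they sum to 1.
w = inv @ ones / (ones @ inv @ ones)

# Resulting portfolio variance, never above the least-volatile single stock.
port_var = w @ cov @ w

print(w, port_var)
```

The real index solves a constrained version of this problem, which is why its composition (and hence any backtest of it) changes at every semi-annual re-optimization.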

That means the parameters for the index are changed twice a year. Hence, backtesting over a 15-year period would imply changing the selection process for the universe 30 times. I have been running a Best12(USMV)-Trader model live on my website iMarketSignals.com since the beginning of July, using the stock holdings of USMV as the universe and updating the universe every 3 months. So far, with great success: the model is up about 22%, whereas SPY gained only 6.7% over this period. I would love to offer this model as an R2G, but it would not qualify under the existing rules. Perhaps P123 could provide a section for specialty R2G models such as this one, where the backtest period is short but the methodology is spelled out for subscribers.
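A rough sketch of the quarterly universe-refresh process described above. `get_etf_holdings` and `rank_stocks` are hypothetical placeholders, not real P123 or iShares APIs:

```python
from datetime import date

def quarter_starts(start, end):
    """Yield the first day of each calendar quarter between start and end."""
    y, m = start.year, ((start.month - 1) // 3) * 3 + 1
    while date(y, m, 1) <= end:
        yield date(y, m, 1)
        m += 3
        if m > 12:
            y, m = y + 1, m - 12

def walk_forward(start, end, get_etf_holdings, rank_stocks, top_n=12):
    """Refresh the universe each quarter, then pick the top-ranked names.

    get_etf_holdings(d) should return the ETF's point-in-time holdings on
    date d; rank_stocks(universe, d) should return them best-first.
    """
    picks = {}
    for d in quarter_starts(start, end):
        universe = get_etf_holdings(d)
        picks[d] = rank_stocks(universe, d)[:top_n]
    return picks
```

The key point is that the universe is a function of the rebalance date, not a fixed list, which is exactly what the current R2G rules do not accommodate.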


Geov brings up a point worth reflecting on.

Does R2G aim to compete with Collective2?

Because there are strategies that we can’t run on P123: some data is not there, or some is too recent. Market timing using options data, for example.
There is a market for these; it might as well happen here. But not as long as R2G is marketed as backtested P123 strategies only.

Could an R2G model ever be made of a simple live portfolio where all trades are entered manually by the author based on his own strategies, manual or automated?

We have “foundation, investor, trader, ETF” sections. Why not an “unregulated” section, consisting of live portfolios with manual inputs? Who knows, half the people from Collective2 might flock in. That is, if you stick to the 80-20.

I think an unregulated section would be a good idea. People could then watch the performance for several months before subscribing. P123 could also charge a fee upfront for unregulated R2Gs instead of the 20% cut. Having to pay a launching fee would ensure that only designers who are confident in their models would submit them. Currently one can put anything up in the hope of someone subscribing, because it costs nothing.

I’m not sure I understand Geov’s comment about it “costing nothing” to put up a system. As it stands now, it costs ~$33 per month for six slots whether you use them or not. The problem isn’t that we should pay to launch a system; it is that the competition is fantasy-land. You can pay and pay until you are blue in the face, but so long as you are forced to compete with “juiced” simulations, you won’t draw subscribers, unless of course you do the same.

As for Geov’s suggestion of updating the underlying stock universe every six months, this is a great possibility, and it could be enabled by cancelling the display of simulations altogether. We are already allowed to update our models every six months. By cancelling sims, we could create in-lists or custom universes as desired. The criterion would be the statement of the methodology; subscribers can decide for themselves if they are interested.

As it stands right now, many P123 members really believe they will achieve what simulations suggest they will get. This is wrong and needs to change.

Steve

Steve,
I do not see my P123 membership fee as a cost for putting up R2G models. I consider the fee payment for the ability to do research using the excellent database offered by P123 and to construct models for my own use that are not available as R2Gs. That was, after all, the original intention of P123 members who joined before R2G came into existence.

There is a lot of merit in re-optimizing a model every 6 months and allowing in-lists. As mentioned in my previous post, MSCI does this as well for their index: http://www.msci.com/resources/factsheets/MSCI_USA_Min_Vol_Factsheet.pdf. If you look at the link, you can see their performance graph starting in 1999, but the period 1999 to 2011 is all backtest. The iShares USMV ETF only started in October 2011, so they have only 3 years of out-of-sample performance and have so far under-performed the S&P 500, as minimum-volatility stocks should. Obviously the investors who have about $3 billion in the ETF believe the MSCI backtest data and hope to lose less when the market tanks than if they were invested in SPY.

Georg

georg - Before R2G came into existence, the cost of membership was a lot less: $99/month versus $200/month or $300/month. OK - for my first membership I’m paying $25 per slot, after allowing for the original intention. My second and third memberships are strictly for R2G models, at $33/month per slot.

By the way, some models are already doing what you wish to do: Hemmerling’s Dividend Growth Portfolio and Hoosthu1’s Master100 work from tailored stock universes. There is nothing wrong with this, except for the backtest.

Steve

Georg,

I have models that run on my own server and can’t run on P123. I have considered opening a Collective2 account, but why bother paying $120 every x months without any guarantee that people will be interested? So I just keep my work to myself and have never bothered with Collective2.

If P123 asks for a fixed payment to open an R2G model, I’m out of R2G, and so will be many who don’t want the bother. Subscribers come and go, but the payment is regular; it makes no sense. Bypassing R2G fees is as easy as renting a server and a domain name ($8/month) and publishing the corresponding portfolio picks for a fee. With PayPal, anyone can do this. The sustainability of R2G depends on fees being reasonable enough that people don’t want to bother with the above and are happy with the added visibility. It should be a win-win.

Lastly, why would a fix to P123’s past database affect future performance? This is impossible. If a model has predictive value, it will make money in the future. They can change the past data however they want; they can even erase the whole database, and it doesn’t matter. A market timer that has predictive value doesn’t depend on the past.

Please do not waste time on this. As Marco mentioned, simulations don’t even change that much. We need other functionality, such as properly working short-only systems. These could drastically reduce risk when combined across all R2Gs! We also need instruction manuals for some of the functions we already have but very few know how to use fully…

A simple fix would be to REQUIRE all designers to post this. Why waste P123 resources on it?

I myself think it’s about time we got a decent update to the manuals and added some videos. This would also probably increase new-customer retention and should be relatively easy to do.

I agree with Edwards (SZ): if updates do not change much, why rerun?
From what I can see, subscribers put a lot of weight on real-time performance and look for behavior similar to what the simulation part shows. The models that fulfill this have a lot of subscribers.
And the race is not over; R2Gs will have to prove over several more years that they provide alpha. I guess there will be a lot more success and failure as soon as market behavior changes: how will R2Gs time the market in the next crash, how will they perform when small caps come back into favor, what about rising interest rates, how will models perform with new data (European stocks), etc.? Stars will rise and fall, and the best will show robust behavior in all environments. What I would love is for subs to be able to look up which systems have been deleted by their creators and how they perform now; the longer the R2G program runs, the more important this is going to be (but the incubation time helps a lot here!).
The universe that a model uses should be published (not the stocks, but its name and the fact that it is a custom one created by the designer), and that should be made mandatory and enforced by P123 (the designer should not have a chance to hide it), though the incubation time almost heals this already.
I am not a fan of re-optimizing (e.g., using revisions for this), but this might be because I just do not know how to do it correctly. I rather stick to phenomena that have been robust for centuries (value, momentum, size (small caps), earnings momentum). But from what I learned from a successful friend, this re-optimizing should be possible to do with success.
Just my two cents…
Regards, Andreas

Marco, where are we with this? I believe Geov’s models have been affected by P123 data fixes. He created a “SPY-SH revised 12-7-14” model in an effort to disclose how his original model was affected. But his original model and the newly posted revised model have not shown the same performance since then; they appear to be on opposite sides of the trade. I have asked him to look into this but have not heard back. What is making these two models not match, since the revisions were designed for them to mirror each other going forward?

It is my understanding that he needs to remove the revised 12-7-14 model soon so that he can then show how the P123 data fixes affected his SSO-TLT and other models, as he has a limited number of R2G slots in which to show all this disclosure. Again, I think P123 should provide updated “correct” numbers any time a P123 data fix is made, and I hope you are currently working on that. In the meantime, could you give designers more R2G slots as long as they are used for showing “revised” models due to P123 fixes? We need to be able to see how combos would look with the new corrected data, and therefore need to do that with the revised R2Gs instead of the ones with locked historical data that may not be correct.

How do we know which models have or have not been affected by P123 data fixes?

Thank You

RJJ, we should wait until today’s rebalancing to see whether the two SPY-SH models are in sync again. I think the revision was not picked up last week but should now be effective.

I agree, P123 should provide additional model slots so that each R2G model can be re-calculated to show the performance, etc. as if the model was launched now. Then one would be able to see how P123 data and algo revisions have affected the models.

Redesign is in progress, consisting of:

  • splitting simulated and out of sample results
  • rerun of simulated results
  • front end redesign

Should take 2-3 weeks.

Maybe it’s too much to ask for on a Sunday, especially if we already receive this update as a New Year’s present.

But what will the update look like - like one model page where you can switch between simulated and oos results?
And what is meant by “front end redesign”? A different presentation of the models?

Best,
fips

We’re redoing it in a way that will hopefully address all the criticism. I mean that in a good way.

Thank you for clarifying that :slight_smile:

We have all worked hard for what we have. Like yourself we want to preserve and attempt to grow our resources. We rely on accurate data to make the best informed decisions we can. We appreciate you making fixes as needed for us to have a better chance at success.

Again, hopefully what you are working on will also allow us to run a combo book simulation based on the updated correct historical data after revisions.

Thank you for the update and timeline you are shooting for. Very much appreciated.