Following up on this thread, please find the revised redesign of the R2G model pages. It should address most of the comments that were made. As for how we will generate the rolling test statistics without market timing, I will post something about that soon. Please note that we will release the “rolling tests” tool before the R2G re-do in case we discover issues with the tests.
Could we find simpler English titles for the pages? The language (post-launch results etc…) is convoluted for a newbie.
Why test liquidity back to 1999? The liquidity of the market now has nothing to do with what it was in 1999. Could you calculate liquidity based on the last 5 years, for example?
I’ve found that microcap liquidity calculations are out of whack with actual holdings liquidity. In a big way. I am not alone. I remember Quantonomics and others making this point.
Do you need two decimals for sector and mktcap allocation? 17% would be just fine instead of 17.08%.
How do you plan to harmonize benchmark settings for R2G? Excess perf, alpha, beta all depend on the benchmark, so if you use these metrics to compare models, it’s important that the benchmarks be the same.
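To make the benchmark dependence concrete, here is a minimal sketch of how beta and alpha fall out of a simple regression of model returns on benchmark returns. The function name, the zero risk-free rate and all the return numbers are assumptions made up for illustration; this is not how the site computes its stats.

```python
from statistics import mean


def alpha_beta(model_returns, bench_returns):
    """Per-period (alpha, beta) from a least-squares fit of model returns
    on benchmark returns. Assumption: risk-free rate is zero."""
    if len(model_returns) != len(bench_returns) or len(model_returns) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    m_bar = mean(model_returns)
    b_bar = mean(bench_returns)
    cov = sum((m - m_bar) * (b - b_bar)
              for m, b in zip(model_returns, bench_returns))
    var = sum((b - b_bar) ** 2 for b in bench_returns)
    if var == 0:
        raise ValueError("benchmark returns have zero variance")
    beta = cov / var
    alpha = m_bar - beta * b_bar
    return alpha, beta


# Made-up monthly returns, for illustration only.
model = [0.04, -0.02, 0.03, 0.01, -0.01, 0.05]
bench_a = [0.02, -0.01, 0.015, 0.005, -0.005, 0.025]  # stand-in benchmark A
bench_b = [0.03, -0.03, 0.01, 0.02, -0.02, 0.04]      # stand-in benchmark B

alpha_a, beta_a = alpha_beta(model, bench_a)
alpha_b, beta_b = alpha_beta(model, bench_b)
```

The same model series produces a different alpha and beta against each stand-in benchmark, which is why comparing these metrics across models only makes sense when the benchmark is the same.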
Good question about a universal benchmark. I know that some of my models correlate better to IWM than SPY, but I figure that if a model cannot beat SPY over the long term then there is no reason to use it. So should SPY be the benchmark for all?
I would like to make some suggestions regarding the benchmark:
(1) The benchmark should be the equal-weighted performance of the custom universe. I believe this is the most logical way of determining model performance relative to a benchmark as the stocks chosen by the model are from the custom universe.
(2) At the R2G top level, where sorts and filtering occur, models should be grouped based on the correlation between benchmarks as defined above. There should be a mathematical way of automatically grouping the models and this would replace foundation, investor, etc.
(3) New benchmarks need to be introduced as there are market neutral models now. Perhaps ST Bonds ETF or LT Bonds ETF.
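As a rough sketch of suggestions (1) and (2): the equal-weighted benchmark is just the per-period average return of the stocks in the custom universe, and models can then be grouped by how well their benchmarks correlate. The tickers, returns, the 0.8 threshold and the greedy grouping rule are all assumptions for illustration; a real implementation would probably use a proper clustering method.

```python
from statistics import mean


def equal_weight_returns(universe_returns):
    """universe_returns: dict of ticker -> list of period returns.
    Returns the per-period equal-weighted average (assumes the
    universe is rebalanced to equal weights every period)."""
    series = list(universe_returns.values())
    if not series:
        raise ValueError("universe is empty")
    n_periods = len(series[0])
    if any(len(s) != n_periods for s in series):
        raise ValueError("all stocks need the same number of periods")
    return [mean(s[t] for s in series) for t in range(n_periods)]


def correlation(xs, ys):
    """Pearson correlation of two equal-length return series."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a series has zero variance")
    return cov / (var_x * var_y) ** 0.5


def group_by_correlation(benchmarks, threshold=0.8):
    """Greedy grouping: a model joins the first group whose first member
    has a benchmark correlated at or above the threshold, otherwise it
    starts a new group."""
    groups = []
    for name, series in benchmarks.items():
        for group in groups:
            if correlation(benchmarks[group[0]], series) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical tickers and made-up returns.
universe = {"AAA": [0.02, 0.04, -0.01], "BBB": [0.00, -0.02, 0.03]}
universe_benchmark = equal_weight_returns(universe)

# Made-up benchmark series for three hypothetical models.
benchmarks = {
    "model_1": [0.01, 0.02, -0.01, 0.03],
    "model_2": [0.02, 0.04, -0.02, 0.06],
    "model_3": [-0.01, -0.02, 0.01, -0.03],
}
groups = group_by_correlation(benchmarks)
```

With this made-up data, model_1 and model_2 land in one group and model_3 in another, since its benchmark moves in the opposite direction.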
For stock R2Gs the equal-weighted performance of the custom universe is best depending on the amount of programming power needed to figure it out. This is true even though in some models the effective universe is actually smaller than the custom universe due to buy rules.
But for ETF R2Gs, which are generally picking from a limited pre-selected list of ETFs no matter what the universe is, this may not be the best way to go.
ETFs are great for diversifying and/or switching between asset classes, which calls for an entirely different type of benchmark. Some appropriate benchmarks for ETF R2Gs and sims should include:
The aggregate bond market (i.e. AGG).
Treasuries of various durations from SHY (cash) all the way up to EDV (decades of duration).
All world index both in USD and currency hedged versions.
All world ex-US both in USD and currency hedged versions.
60/40 Stocks/Bonds.
The “Permanent Portfolio” which is 25% each: Stocks, Bonds, Gold, Cash.
The Permanent Portfolio plus a few more alternative assets such as REITs, Foreign Stocks and Commodities.
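For the blended benchmarks in that list, a minimal sketch of how a fixed-weight benchmark such as 60/40 or the Permanent Portfolio could be computed from its components. The component returns are made up, and rebalancing back to the target weights every period is an assumption; the real rebalancing schedule would have to be decided.

```python
def blended_returns(weights, component_returns):
    """Per-period return of a benchmark rebalanced to fixed weights
    every period (the rebalancing frequency is an assumption)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    lengths = {len(component_returns[name]) for name in weights}
    if len(lengths) != 1:
        raise ValueError("all components need the same number of periods")
    n_periods = lengths.pop()
    return [
        sum(w * component_returns[name][t] for name, w in weights.items())
        for t in range(n_periods)
    ]


def growth_of_one(returns):
    """Compound a return series into the ending value of 1 unit invested."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value


SIXTY_FORTY = {"stocks": 0.60, "bonds": 0.40}
PERMANENT = {"stocks": 0.25, "bonds": 0.25, "gold": 0.25, "cash": 0.25}

# Made-up period returns for each asset class, for illustration only.
asset_returns = {
    "stocks": [0.10, -0.05],
    "bonds": [0.00, 0.05],
    "gold": [0.02, 0.02],
    "cash": [0.00, 0.00],
}

sixty_forty = blended_returns(SIXTY_FORTY, asset_returns)
permanent = blended_returns(PERMANENT, asset_returns)
```

Adding REITs, foreign stocks or commodities would only mean another entry in the weights and in the component returns.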
I have a question about how model revisions will be handled - will the pre-launch statistics always refer to the initial model or to the revised model?
I think it would be more helpful to refer to the revised model. This is presumably now possible as you don’t have to “unhack” two simulations as discussed in the past.
Some additional thoughts:
Will the new stats like the rolling tests, end value dispersion and existing ones like bottom 20% liquidity be available in normal simulations as well? These are interesting and an automated process would be appreciated.
The performance charts should be log scale, with the lowest value on the Y axis equal to the lowest value in the chart + say 1 point, so that the log scale gives a better idea of performance and no space is wasted below the curve, which compacts the chart. This should apply to sim charts too, of course.
Perhaps the benchmark for ETFs should be the average performance of all the stock models. The logic being that the ETF model should outperform the average stock fund.
DISCLAIMER: All my ideas tend to be wild and may not have practical implementation so take with the usual grain of salt.
It appears that some of the Trading Stats (Avg Return Winners; Avg Return Losers; Winning %) have disappeared from the revised format. Perhaps someone will educate me as to why they are not useful, but I use that data to calculate Profit/Loss Ratio and plot the data on a breakeven expectancy curve.
Having the data for pre-launch and post-launch would be helpful to compare the two and see how much the model is moving on the plot as more OOS data is collected.
I review R2G models that are further away from the breakeven expectancy line and in a zone where winning % and profit/loss ratio is comfortable for me personally (not knowing the model designer & underlying model rules makes it psychologically more difficult for me to accept a lower winning % with a high profit/loss ratio unless it’s proven with OOS data over different market conditions).
I expect the OOS plot to be different from the pre-launch plot because of market conditions, fewer closed trades, etc. But if a model that has a pre-launch plot far away from the breakeven expectancy line starts plotting in the negative expectancy area OOS (taking into account that sufficient closed trades have occurred OOS), then that raises a red flag to me that I need to look at whether this model is worthy to be used in my portfolio.
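For anyone not familiar with the breakeven expectancy idea, here is a minimal sketch using the three trading stats mentioned above (Avg Return Winners, Avg Return Losers, Winning %). These are the textbook formulas, which may differ in detail from the poster's own spreadsheet, and the numbers are made up.

```python
def profit_loss_ratio(avg_win, avg_loss):
    """Average winner divided by the size of the average loser.
    avg_loss may be passed as a negative number or as a magnitude."""
    return avg_win / abs(avg_loss)


def breakeven_win_rate(pl_ratio):
    """Winning % at which expectancy is exactly zero for a given
    profit/loss ratio. This is the breakeven expectancy curve."""
    return 1.0 / (1.0 + pl_ratio)


def expectancy(win_rate, avg_win, avg_loss):
    """Expected return per trade, in the same units as avg_win."""
    return win_rate * avg_win - (1.0 - win_rate) * abs(avg_loss)


# Made-up stats for a hypothetical model, in percent per trade.
avg_return_winners = 12.0
avg_return_losers = -6.0
winning_pct = 0.50

pl_ratio = profit_loss_ratio(avg_return_winners, avg_return_losers)
needed_win_rate = breakeven_win_rate(pl_ratio)
edge = expectancy(winning_pct, avg_return_winners, avg_return_losers)
```

A model plots above the breakeven line when its winning % is higher than `needed_win_rate` for its profit/loss ratio. Running the same calculation separately on pre-launch and post-launch trades would show how far the point has moved.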
In your approach of turning off market timing for the 100 moving averages, how are you going to handle the use of Flip-Flop rules? Most of them are also a method of timing, but can be a major part of the reasoning behind a developer’s design of a trading system.
It is well documented that the stocks that have had the greatest gains prior to a recession are the same ones that fall the fastest and lose the most during a recession. A well designed trading system that features these stocks might recognize this and be designed to move to cash or an inverse fund during recessions to protect capital.
Removing market timing for all the moving average comparisons causes a severe disadvantage for a designer who properly uses timing to protect capital (what is a “proper” use of timing is another issue). Most average investors (not pros) I have talked to would prefer to be out of the market during recessions. They may get upset when they miss 5% gains due to whipsaws, but they are extremely fearful of anything over 30% losses during recessions. Since average investors are who R2G was created for, their desires need to be accommodated.
I propose that 2 series of moving averages are run: one without timing and one with timing. I feel that is the only “fair” way to treat systems that use timing.
Actually, investors (pros and individuals) prefer to be out of stocks that go down and/or trail benchmarks. Big losses during recessions are obviously a subset of that.
The issue here, however, isn’t what investors want. We know and agree on that. The issue is what is actually delivered in real time, not via sim. As I noted in another post, if you believe you have something that delivers this, we’ll discuss it offline.
In terms of how the information is displayed, I would suggest compiling it all in one layout instead of having to click on the three separate tabs ‘post-launch’, ‘rolling tests’ and ‘pre-launch’.
I agree with aurelaurel, Denny and iavanti on the following points:
Like iavanti said, use log charts. The reasoning here is that you want to constantly compare to the benchmark and not have 1 year take the chart to the stratosphere, making comparisons more difficult visually. Also, I would like to see 5-yr liquidity statistics.
Like Denny said, we should have the tests shown with and without timing. However, I would add that I would like to see a “timing-less” simulation chart in the simulation page along with the normal simulation chart. This would give a decent idea of “intrinsic” risk in case the timing does not work during a future correction.
Here is something I believe would be very useful to have: INCLUDE WEEKLY RETURNS DISTRIBUTION WITH AND WITHOUT TIMING!! Can include monthly and yearly as well. Perhaps this could be used instead of showing the timing-less chart in case adding the chart does not happen. In my opinion this is an important set of information that prospective subscribers can look at, and it is easy to understand it and its importance.
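A minimal sketch of what a weekly returns distribution with and without timing could look like behind the scenes: turn weekly equity values into returns, then count how many weeks fall into each 1% bucket. The equity values, the bucket width and the function names are all assumptions for illustration.

```python
import math
from collections import Counter


def weekly_returns(equity):
    """Convert a list of weekly closing equity values into weekly returns."""
    return [later / earlier - 1.0 for earlier, later in zip(equity, equity[1:])]


def return_histogram(returns, bin_width=0.01):
    """Count returns per bucket. Key k means the bucket
    [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter(math.floor(r / bin_width) for r in returns)
    return dict(sorted(counts.items()))


# Made-up weekly equity curves for the same hypothetical model.
equity_with_timing = [100.0, 101.5, 101.0, 102.3, 102.3, 103.9]
equity_without_timing = [100.0, 103.2, 97.4, 101.1, 95.8, 100.4]

hist_with = return_histogram(weekly_returns(equity_with_timing))
hist_without = return_histogram(weekly_returns(equity_without_timing))

# Worst weekly bucket in each case (lower edge, in percent).
worst_with = min(hist_with)
worst_without = min(hist_without)
```

Showing the two histograms side by side lets a prospective subscriber see at a glance how much of the downside protection comes from the timing rules. Monthly and yearly versions would only need a different sampling of the equity curve.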
4) Like aurelaurel said, keep the end-user in mind! I think it’s fair to say most of us understand this, but do all prospective users? I do not think so… Show everything on two or fewer pages to simplify a little. We do not want a less friendly or confusing site! The way I would do it is to just have 2 tabs: one for results (live and sim, each with its own chart) and one for any present and future tests and statistics. Also, include an education section on the site explaining any test you plan on adding. We are already missing enough manuals and videos for new users, and even for experienced ones, as is, and I would not like to see more confused users and questions from my subscribers and peers.
Aesthetically I would like to see more blue and less grey. There is scientific evidence some blue tones might help inspire trust too which is always good if we are trying to generate revenue for us, the site, or our individual causes.
Give the client/user/viewer the option to choose his/her own benchmark! Can have SPX or SPY as the default. This addresses all of the benchmark-related requests on this thread and everyone would be happy.
Unlike some of the recent comments, I prefer not to see backtest and OOS results on the same page. Backtest is not performance and the two should not be mixed for the sake of convenience.
Hi Steve. The way I would do it is to just have 2 tabs: one for results (live and sim, each with its OWN chart) and one for any present and future tests and statistics (statistics also separated between live and sim). No mixing but still convenient.
It seems there are a lot of proposals and wishes here, and I must admit all are reasonable in one way or another, as they are useful and valuable to the people who expressed them. That is OK.
A lot of the proposals are about the user interface, which is also subjective, but here I think we should rely on UI pros.
My proposal would be to focus on functionality (the stat back-end system) and let users manage the UI the way they want, like Panels for Stocks (functionally, not in how it looks now). Of course, there could be a default layout based on the proposals in this thread and, later on, on usage statistics.
As for the stat back-end system, I think this (if server performance allows) could be calculated OLAP-cube-like, using dimensions. This would let users get stats and visualize them any way they want, and would fit any wish in this thread.
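To illustrate the OLAP-cube idea, a minimal sketch: each stat is stored as a record tagged with dimensions, and any view is a roll-up over the dimensions the user picks. The dimension names (model, phase, timing), the measure and the numbers are hypothetical, not an actual site schema.

```python
from collections import defaultdict


def roll_up(records, dimensions, measure):
    """Average `measure` over all records, grouped by the chosen dimensions.
    Returns a dict keyed by a tuple of dimension values."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        totals[key][0] += rec[measure]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up records; every field name here is an assumption.
records = [
    {"model": "A", "phase": "pre-launch", "timing": "on", "annual_return_pct": 30.0},
    {"model": "A", "phase": "pre-launch", "timing": "off", "annual_return_pct": 20.0},
    {"model": "A", "phase": "post-launch", "timing": "on", "annual_return_pct": 10.0},
    {"model": "B", "phase": "pre-launch", "timing": "on", "annual_return_pct": 40.0},
]

by_timing = roll_up(records, ("timing",), "annual_return_pct")
by_model_phase = roll_up(records, ("model", "phase"), "annual_return_pct")
```

The same records answer both "with vs. without timing" and "pre-launch vs. post-launch per model" without any new calculation, which is the point of storing the stats by dimension.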
As for me, I would like it to be possible to bring all the stats into the R2G list (not displayed by default, but available to display) and to download it as xls.