2 Bad Years: Proven Statistically

So I am a little confused here. I know that is my normal state of mind, so please spare me the comebacks :)

Back in May, Marco indicated the following: https://www.portfolio123.com/mvnforum/viewthread_thread,11763_offset,40

This was followed up by an announcement in August: https://www.portfolio123.com/mvnforum/viewthread_thread,11886

Now Yuval and others are telling me that ETF constituent data is “sparse” and P123 doesn’t have it. OK, perhaps it is none of my business how the S&P indices are now being generated or where the constituent data is coming from. It is possible that I mistakenly concluded that P123 went ahead and subscribed to the aforementioned data service. But I’ve gotta tell you that this is spoiling my prime activity in life, which is to petition for as much new functionality as possible.

It would be nice to know what the current situation is regarding indices and ETF constituent data (for 1500 ETFs, not just the S&P ETFs), and also what the future plan is once the transition to FactSet is complete. If this is a sensitive area, then OK, I can live with that answer.

Thanks
Steve

The traditional way of getting S&P 500 constituents was for Marco to periodically remind me that we hadn’t updated in a while, and for me to go to S&P’s web site and hunt for changes in index constituents. That was for the S&P 500 only. The others were added after we became an S&P licensee, and index constituent data was included in p123’s license. But S&P, in its corporate wisdom, decided to break index constituent licensing out into a separate profit center and make users (not just p123; all users) pay separately.

P123’s needs are sufficiently modest to allow it to download holdings for the ETFs it needs at its chosen frequency. Accordingly, P123 declined to purchase a newly created separate license for index constituent data. And based on the way S&P started reducing the asking price, I suspect many, many others also declined.

There is nothing sketchy about this. All holdings for all ETFs are publicly disclosed daily, typically via HTML and Excel downloads from the fund’s web site.
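For anyone who wants to grab those holdings themselves, here is a minimal sketch of the idea. The URL and column names are placeholders, not a real endpoint; every sponsor publishes its own file layout, so check the actual fund page for the real link.

```python
# Minimal sketch of pulling an ETF's published daily holdings file.
# HOLDINGS_URL and the column names are placeholders -- each fund
# sponsor publishes its own file (often .xlsx or .csv) on its web site.
import pandas as pd

HOLDINGS_URL = "https://example.com/etf/holdings-daily.xlsx"  # placeholder

def load_holdings(url: str) -> pd.DataFrame:
    """Download a daily holdings file and keep ticker/weight columns."""
    df = pd.read_excel(url)  # many sponsors also offer CSV
    return df[["Ticker", "Weight"]]  # adjust names to the actual file

if __name__ == "__main__":
    print(load_holdings(HOLDINGS_URL).head())
```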

For those who model based on ETF constituents and who, therefore, need a large volume of data, it’s more convenient to license from a data provider who collects the info and, essentially, sells convenient access. That’s what gets sold; the information itself (just like everything else we use that comes from 10-Ks, 10-Qs etc) is freely available.

Like S&P, FactSet also licenses ETF constituents, and in theory, we could license all constituents for all ETFs in an automated way. But although the information is free, convenient automated access is not. At this time, we don’t sense enough interest from enough users to justify the additional cost.

This is all fine and dandy but just dances around the issue. You can get current S&P constituents from the ETF provider, but I don’t think you can get historical info. I won’t question where you get the historical data from, because that is where it gets sketchy.

My issue is that Marco had identified a source for $240/month that provides all historical constituent data for the majority of ETFs, and I was under the impression that this was the “plan”: to use this alternate supplier’s data. If that is NOT the case, then so be it. I’m not pressuring or questioning anyone; I’m just saying it’s quite peculiar that no one even acknowledges that route now. Yuval says that historical data is “sparse”. Even you, Marc, refer to the S&P data, not to this alternate supplier of ETF constituents.

Anyway, I’m not trying to make waves. I just don’t understand the change of direction in what seemed to be a pretty clear path forward.

Steve

It’s a question of priorities, Steve. We’re working on a whole host of product developments, and implementing the switch to FactSet and offering global data has to take priority over adding ETF constituents to our system, which would involve a lot of programming work on both the front end and the back end. Regarding the data being sparse: there’s no way that I can see to get any historical ETF data prior to 2009, even with the source that Marco identified. In the meantime, we’ll shortly be releasing a way for you to import your own factors through an API, and if you want to subscribe to a service for historical ETF constituent data, you’ll be able to add that data to your systems.

Yuval - I got 'ya. I was just confused by the disconnect. If somebody had said, “Yes, we looked at it but decided in the end to go a different way,” or “It isn’t a priority right now,” I would have understood.

Thanks

Hi Florian,

Maybe I am the only one, but my ports have been more volatile recently than I would have expected from my backtests and early out-of-sample results. In hindsight, I could have been less concentrated and more diversified.

I am considering checking the correlation of each stock I purchase in my ports with each of the Sector SPDR ETFs. This can be done on the web page. I would then purchase an equal amount (it wouldn’t have to be equal) of the least-correlated ETF whenever I purchase a stock in my port.
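For concreteness, here is a rough sketch of that selection step in Python. This is not a P123 feature; it assumes you have daily return series from whatever data source you use:

```python
# Rough sketch of the hedging idea: given a stock's daily returns,
# find the Sector SPDR it correlates with least.
import pandas as pd

SECTOR_SPDRS = ["XLB", "XLE", "XLF", "XLI", "XLK", "XLP", "XLU", "XLV", "XLY"]

def least_correlated_etf(stock_returns: pd.Series,
                         etf_returns: pd.DataFrame) -> str:
    """etf_returns columns = ETF tickers; returns the ticker with the
    lowest correlation to the stock's return series."""
    return etf_returns.corrwith(stock_returns).idxmin()

# Usage with your own data:
# etf_rets = pd.DataFrame(...)   # columns = SECTOR_SPDRS, daily returns
# stock_rets = pd.Series(...)    # the stock just purchased
# hedge_etf = least_correlated_etf(stock_rets, etf_rets)
```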

The good:

  1. reduces individual stock risks

  2. reduces overall market risks

  3. low slippage on ETFs

  4. investing in the S&P 500, by itself, would offer diversification for my small-caps

  5. these low correlation ETFs have a positive expected return (but may underperform the S&P 500).

  6. I cannot short in my SEP IRA; i.e., pairs trading and statistical arbitrage would not be options even if I knew how to do them.

  7. Inverse ETFs have a negative expectation

  8. Inverse ETFs have volatility drag (see the quick arithmetic sketch after this list).

  9. the inverse correlation of inverse ETFs is not perfect, and you can lose money on both the ETF and the stocks you hold. My stomach is probably too weak to stand that for long.
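To make point 8 concrete, here is the quick arithmetic: a toy two-day example with a daily-reset -1x fund.

```python
# Volatility drag on a daily-reset -1x inverse ETF: even when the
# underlying index round-trips, both legs lose.
index_moves = [0.10, -0.10]  # index up 10%, then down 10%

index_level, inverse_level = 1.0, 1.0
for r in index_moves:
    index_level *= 1 + r      # 1.10 * 0.90 = 0.99
    inverse_level *= 1 - r    # 0.90 * 1.10 = 0.99

print(f"index:   {index_level - 1:+.2%}")    # -1.00%
print(f"inverse: {inverse_level - 1:+.2%}")  # -1.00%, both lose
```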

The bad:

Likely to be holding a lot of Consumer Staples and Utilities. Boring, but probably the best things to hold when the recession comes. These do underperform the S&P 500, but not by as much as I thought, and not as badly as the average Designer Model, as you note (even excluding fees).

Just a thought, FWIW.

In any case, the excellent posts and your Designer Models (which are doing well) are much appreciated.

-Jim

Jrinne,

Looking at the past two years is short-sighted, to say the least. Most strategies on this site have a strong fundamentals basis. We have been in a growth regime for the last 10 years, with market gains concentrated in a few high-octane, high-sentiment growth stocks that take on high debt at low interest rates and defer profits. At the same time, a lot of these stocks don’t have great fundamentals by historical standards and would be unable to perform in an environment with less distorted macro money-flow dynamics. This is not typical of historical market activity.

Not to say that a lot of models aren’t flawed, but if you consider a model flawed because it underperforms for 2, 5, or 10 years, then you are too short-sighted. No model is going to reliably beat its benchmark every year without significant optimization, and in that case it will be more likely to underperform thereafter. There have been several articles stating that value investing is dead. Maybe it is, but it was in 1999 too, and stock investing was dead in the late 1970s too. Keep in mind also that Buffett has $128 billion in cash accumulated. This is not a good environment for fundamental stock picking.

Chasing models that have performed the best over the last few years IS chasing performance. And a model that has performed well over the last two years probably isn’t going to perform well over most of history.

Even staff at P123 recognize, now, that there might be a little overfitting at times. Bootstrapping is recommended as a possible method to alleviate this problem (a quick sketch below). But there are flawed models, to be sure. If you want to wait 10 years to find out which ones are flawed, that’s a plan.
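For anyone curious what the bootstrapping idea looks like in practice, here is a minimal sketch (my own illustration, not a P123 feature): resample the strategy’s monthly excess returns with replacement and see how often the resampled mean comes out at zero or worse.

```python
# Minimal bootstrap check for a backtest edge: if a large fraction of
# resampled means are <= 0, the apparent edge may be luck/overfitting.
import numpy as np

def bootstrap_p_value(excess_returns: np.ndarray,
                      n_boot: int = 10_000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    n = len(excess_returns)
    means = np.array([rng.choice(excess_returns, size=n, replace=True).mean()
                      for _ in range(n_boot)])
    return float((means <= 0).mean())

# Example with made-up monthly excess returns:
rets = np.array([0.02, -0.01, 0.03, 0.01, -0.02,
                 0.04, 0.00, -0.03, 0.02, 0.01])
print(bootstrap_p_value(rets))  # fraction of resamples with mean <= 0
```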

This does not even take into account survivorship bias. I know I used to follow strategies from some of the Designers that just are not there now. Even so, look at what would have happened if you had placed equal amounts into all of a (particular) designer’s models that you can see.

Maybe you could have known which models to pick ahead of time.

The ONLY automated strategies that do this (mostly rotating ETFs) come from Georg, using machine learning techniques away from P123. Yes, regression (along with the other techniques Georg uses) is a machine learning technique.

If P123 wants to stand back and let others do this, they should. Marketing and new features are above my pay grade.

I am not against screening and then looking at the 10-K. I admire people who can do that; I cannot. It is way above my pay grade and abilities, and not really what I came to P123 for either. I would like to see someone’s out-of-sample results on this. I am not the only one able to post here. Besides, on the topic of Designer Models, are we even sure which factors to look at on a 10-K? Do we know enough about a model to second-guess unknown factors? How far will we go second-guessing a black box?

Anyway, I stand by my only conclusion: not a good 2 years by any measure. Everyone has their own unproven theories to explain this. “We are just short-sighted” is a story I am familiar with, one I deal with every day in the eye clinic ;-) Not that applicable here, in my professional opinion.

-Jim

Interesting idea. Instead of checking the correlation of each portfolio holding against the SPDRs, I tried looking at the portfolio as the instrument. Unfortunately, P123 doesn’t present a portfolio’s equity curve as a Series, so we can’t get there from here. Maybe, as a first baby step, P123 could offer a GetPortSeries function. With that in place, we could start experimenting with making a port that picks the ETF least correlated with a stock port, and then combining both in a Book.
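In the meantime, the workflow can be approximated offline. The sketch below stands in for the hypothetical GetPortSeries with an exported equity curve (GetPortSeries does not exist in P123; it is the function being proposed here):

```python
# Offline stand-in for the proposed GetPortSeries workflow: pick the
# ETF whose daily returns correlate least with the port's equity curve.
import pandas as pd

def least_correlated_to_port(port_equity: pd.Series,
                             etf_prices: pd.DataFrame) -> str:
    """port_equity = daily portfolio value; etf_prices columns = tickers."""
    port_rets = port_equity.pct_change().dropna()
    etf_rets = etf_prices.pct_change().dropna()
    return etf_rets.corrwith(port_rets).idxmin()

# port_equity could come from a CSV export of the port's value history:
# port_equity = pd.read_csv("port.csv", index_col=0,
#                           parse_dates=True)["Value"]
```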

Walter

Jrinne,

I am not suggesting sticking with a strategy forever, but two years is still too short. I would argue that seeing some past underperformance relative to a benchmark may actually be desirable, because it suggests the strategy has not been overfit.

It is not impossible to beat the S&P 500 on a long-term basis, but you have to be willing to underperform it for periods of time. Most investors aren’t, which is why indexing is better for them: at least they can match the index (or slightly trail it). The S&P 500 is not a magical index that picks the best stocks. It strives to do that, but the rules for inclusion are readily available and can be simulated. At its heart it is a large-cap, positive-EPS, momentum strategy. So if you can design strategies that pick up on better characteristics, without overfitting or becoming too complicated, you should be able to outperform.

You also need to accept that you aren’t going to beat the index every year. In fact, a lot of active money managers gripe that, because funds are scored year by year, it is nearly impossible to pursue strategies that can outperform in the long run but may underperform for periods in the short run.
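To illustrate the “can be simulated” point, here is a deliberately rough sketch of an S&P-500-like screen. This is not S&P’s actual methodology (the real criteria include float, domicile, sector balance, and committee discretion); the thresholds are illustrative only:

```python
# Hedged sketch of an S&P-500-like screen: large cap, positive trailing
# EPS, a liquidity floor, then take the 500 largest. Thresholds are
# illustrative, not S&P's real rules; the momentum tilt mentioned above
# comes mostly from cap-weighting, which isn't modeled here.
import pandas as pd

MIN_MARKET_CAP = 8.2e9   # illustrative floor, USD
MIN_DOLLAR_VOL = 1e6     # illustrative daily liquidity floor, USD

def sp500_like_screen(universe: pd.DataFrame) -> pd.DataFrame:
    """universe columns assumed: market_cap, ttm_eps, dollar_volume."""
    mask = ((universe["market_cap"] >= MIN_MARKET_CAP)
            & (universe["ttm_eps"] > 0)
            & (universe["dollar_volume"] >= MIN_DOLLAR_VOL))
    return universe[mask].nlargest(500, "market_cap")
```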

I will say that Marc is right: don’t overfit your strategies, and keep them simple. Complicated, overfit strategies are an exercise in correlation hunting. Correlation is nothing more than statistical analysis of the past and, without a strong logical fundamental basis, means nothing.

That being said, why is there so much interest in decorrelating portfolios? It just seems like another exercise in data mining. IMHO, diversifying amongst other asset classes (bonds, commodities, cash, real estate, emerging markets) would be a lot more successful in lowering portfolio risk. Yes, those other classes may put a drag on your gains when stocks are doing well, but they can offer true alternative growth or stability when stock markets systemically fall.

I am good with all of that.

Not sure I am making any of the mistakes you are cautioning against. I do hope people feel free to opine.

The only thing I do not like at all is having to pass new ideas that use substantial data through the P123 committee: already decades behind and getting worse.

You have nothing new that you want to do? Without my approval (and the rest of the forum’s)?

Sad if true.

-Jim

What do you mean by “already decades behind”?

So you make my point about nothing being accomplished in the forum. If you are happy with the Designer Models, you should subscribe (if you are not already). You are free to do so.

If there is an error in my statistics let me know. You are free to interpret away as to the cause of this. I do not think I speculated so no disagreement on my part.

If you have a serious question about newer methods then email me. I will link to some books on Amazon.

-Jim

Based on your original question, in all honesty I am not sure there is more I want to do. I can see the trap of more analysis and data leading us to better fit our models. A year ago I might have answered differently, but I’ve come to realize that investing is far from an exact science, nor can it be quantified as well as we would like to think our models suggest. At this point I am satisfied to simplify my models and look for ways to diversify away risk.

I’d be curious about newer methods, if only to be informed.

I don’t use the Designer Models. I’ve built my own, but quite frankly I haven’t executed them for a long period of time. I am suspicious of Designer Models because they are available for others to use and are somewhat black box.

I do share your concern that Designer Models appear to have an overfitting bias as well. My own models underperform for periods of time, but I try to combine them with other models and asset classes to even that out. I will say that my models are incredibly simple in comparison to some of the free Portfolio123 models. I am very leery of overcomplicating models…

As long as you do not encounter a new idea I think you are good!

-Jim

Lol. So I should not read anything new. Right?

I’m not sure I would apply any new information, but I would be curious to read what you are referencing. As I am a knowledge bug, I’m always open to new ideas.

“The Man Who Solved the Market” shows what can be done with a little math. And I do mean a little: Simons’ math IS decades old by today’s standards. Mercer is at the helm now. Simons had a Sharpe ratio of 7.5 at one point and is worth $27 billion. I contend there is something to be learned.

If you want more modern methods: “Advances in Financial Machine Learning” by de Prado. Very inexpensive.

If you remain unconvinced I am good with that. You are welcome to your opinion. I mention this only because you asked.

I think the P123 committee will not be considering any of these methods anytime soon anyway.

-Jim