P-hacking and ETFs

from SSRN-id4078138

“A potentially useful testing group is newly launched ETFs. Many new ETFs have claimed to be based on peer-reviewed research published in the finest academic journals. Few investors realize that peer-reviewed research could have been p-hacked or overfit to such an extent that the results are unlikely to repeat out of sample. Indeed, the evidence points very starkly to this phenomenon (see Brightman, Li, and Liu (2015)).”

Exhibit 4 illustrates the market-adjusted return of all ETFs. Notice that the backtested returns are strong. After application to the SEC and the subsequent launch of the ETF, however, the excess returns are zero. This outcome is consistent with overfitting and/or p-hacking.


Thanks. I would only add that these are aggregated results and some ETFs could have continued to outperform out-of-sample.

BUT looking at the error bars, it looks like EVERY ETF regressed toward the mean. So even if a few of these ETFs had some merit and were based on good ideas, they did not do as well out-of-sample as they did in the backtests. If the error bars represent the full range of the ETFs’ performance, then not a single ETF kept up with its in-sample performance. And if the error bars represent a confidence interval (the paper is not clear on this), I think it is fair to infer that, with any confidence interval they might have used, no (or almost no) ETF kept up with its in-sample performance. As one might expect, it seems to be a very rare thing for an ETF to keep up with its in-sample performance after it goes live.

Do you think the designers of ETFs don’t understand that? Or have they not learned after trying this a few times? Serious question.

I am not saying that some ETFs or models might not have merit (I think some do, in fact). Rather, I am saying that even if the model does have merit, it will not do as well as the backtest would predict.
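To make that concrete, here is a rough sketch of the selection effect. It is a toy model, not anything from the paper: the function name, the normal distributions, and all the parameter values (`alpha_sd`, `noise_sd`, `top_fraction`) are my own assumptions. Each strategy gets a real but modest true alpha, the backtest and the live period each add independent noise, and only the strategies with the best backtests are "launched".

```python
import random
import statistics


def simulate_launches(n_strategies=10_000, alpha_sd=1.0, noise_sd=2.0,
                      top_fraction=0.05, seed=0):
    """Toy model (assumed, not from the paper): launch only the best backtests."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_strategies):
        true_alpha = rng.gauss(0.0, alpha_sd)             # real merit, if any
        backtest = true_alpha + rng.gauss(0.0, noise_sd)  # in-sample luck
        live = true_alpha + rng.gauss(0.0, noise_sd)      # fresh, independent luck
        records.append((backtest, live, true_alpha))

    # Keep only the strategies whose backtests looked best.
    records.sort(key=lambda r: r[0], reverse=True)
    launched = records[: int(n_strategies * top_fraction)]

    return {
        "backtest": statistics.mean(r[0] for r in launched),
        "live": statistics.mean(r[1] for r in launched),
        "true_alpha": statistics.mean(r[2] for r in launched),
    }


result = simulate_launches()
print(result)
```

Under these assumptions the launched group has positive true alpha on average, so the ideas do have merit, yet its live performance lands well below its backtest, because picking on the backtest also picks on good luck that does not repeat.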

This is consistent with your point or the point of the paper (and supports it) as you know. It is just a slightly different interpretation of the data, I believe.


There is likely a mix of p-hacking and mean reversion in alpha. For example, value has not done well for most of the past 10 years. I expect it to mean-revert somewhat over the next 10. It would be unusual for an ETF to be built on an idea that worked well only up until 10 years ago. Most strategies want to show strong recent performance over the past 5, 10 and 20 years. But this creates a case for lower alpha in the following years. What would be more interesting is a study of ETF methodologies over a longer horizon: 50 years before and 20 years after inception. Is there some long-term average of the strategy return that comes back over time?


I agree with everything you said.

I would only add that “mean-reversion” and “regression-toward-the-mean” are actually two different things, with both occurring here.

Regression toward the mean, as you probably know, is based purely on UNCHANGING probabilities. Think of a baseball player on the cover of Sports Illustrated who just cannot remain that lucky with his batting average (this is called the Sports Illustrated Curse in statistics). Mean reversion is more active. For example, value companies cannot keep doing that well because their value is inflated in the market now (because of interest rates or whatever).
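A small simulation can show the difference. This is only a sketch under my own assumptions: returns follow a simple AR(1) process, `phi` is a hypothetical autocorrelation parameter, and "unusually good" means more than 1.0 above the long-run mean of zero. With `phi = 0` the returns are independent draws with unchanging probabilities. With `phi < 0` a good period actively drags the next one down.

```python
import random
import statistics


def next_return_after_good_period(phi, n=200_000, threshold=1.0, seed=1):
    """Average return in the period right after an unusually good period.

    Assumed model: r[t] = phi * r[t-1] + noise, with a long-run mean of zero.
    """
    rng = random.Random(seed)
    prev = rng.gauss(0.0, 1.0)
    followers = []
    for _ in range(n):
        cur = phi * prev + rng.gauss(0.0, 1.0)
        if prev > threshold:       # the previous period was unusually good
            followers.append(cur)  # record what happened next
        prev = cur
    return statistics.mean(followers)


iid_case = next_return_after_good_period(phi=0.0)         # unchanging probabilities
reverting_case = next_return_after_good_period(phi=-0.3)  # active pull-back

print("after a good period, independent returns:", iid_case)
print("after a good period, mean-reverting returns:", reverting_case)
```

In the independent case the next return is simply average again, which is regression toward the mean: the luck stops, but nothing pays it back. In the mean-reverting case the next return is below average, which is the compensating behavior the finance term describes.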

From the Wikipedia article “Mean reversion (finance)”, which does not say it very well in my opinion but hints at the difference:

“In finance, the term ‘mean reversion’ has a slightly different meaning from ‘return or regression to the mean’ in statistics. Jeremy Siegel uses the term ‘return to the mean’ to describe a general principle, a financial time series in which ‘returns can be very unstable in the short run but very stable in the long run.’ Quantitatively, it is the standard deviation of average annual returns that declines faster than the inverse of the holding period, implying that the process is not a random walk, but that periods of lower returns are then followed by compensating periods of higher returns, for example in seasonal businesses.”

Regression toward the mean is a universal law at play everywhere, including but not limited to finance.