Here is the critical sentence in the paper, the one that opens Section 3.3.1: “Empiricists in the anomalies literature have much flexibility in test designs.”
Ere’s an example with one of his factors, dividend yield ( he labeled it a.2.14 Dp, Dividend Yield and this is from page 76 of the pdf):
“At the end of June of each year t, we sort stocks into deciles based on dividend yield, Dp, which is the total dividends paid out from July of year t−1 to June of t divided by the market equity (from CRSP) at the end of June of t. We calculate monthly dividends as the begin-of-month market equity times the difference between returns with and without dividends. Monthly dividends are then accumulated from July of t − 1 to June of t. We exclude firms that do not pay dividends. Monthly decile returns are calculated from July of year t to June of t + 1, and the deciles are rebalanced in June of t + 1.”
Based on the testing approach, examining the significance in return between the top and bottom decile, the authors got what they were supposed to get; no benefit.
High yield dividend-paying stocks are not supposed to outperform low-yield dividend-paying stocks, If anything, we expect the reverse. High dividend yield is such because the market expects the dividend to be cut or eliminated, and the market’s track record in predicting this sort of then has been pretty good. If you want to use yield as a factor, you have to create a specialized sub-sample defined by companies for which dividends are not likely to be reduced or eliminated.
The same holds true for every factor. None can ever be expected to work for an entire universe; all have to be applied to a subset. For example, low P/E can only be preferable when applied to a universe of companies with better growth potential and/or less risk than the market assumes. Etc., etc. etc.
The paper proves a point, but it looks like it’s not the one they thought they were proving. They are proving that pure mega-sample quant analysis accomplished nothing. And this is a great thing for us. Unlike researchers like this, we have screening/buy rules and custom universes, so we can study and profit from anomalies don’t even know enough to be studying. So the more papers like that come out, the better things get for us as our trades can get less crowded.
As for the use of statistics – it’s great BUT BUT BUT:
S - DK = BFM
-DK = CSD
Therefore,
S + CSD = BFM
And,
BFM = OPU or S - DK = OPU or S + CSD = OPU
where,
S = Statistics
DK = Domain Knowledge
BFM = Big Fu**ing Mess
CSD = Crappy Study Design
OPU = Opportunities for Portfolio123 Users