Disconnect: Backtest vs. Real World

I had a P123 account several years ago but wound up cancelling it. Here’s why:

I have many years of experience in developing trading systems. I’m a stickler about keeping my systems simple and not overfitting to past data. I test in-sample and out-of-sample on as much data as is available.

I had come up with a system which backtested quite well and traded it for a few years. The problem is, there seemed to be a disconnect between my backtested returns and my real-world returns. My backtests were doing quite well on paper but my real-world trading was doing quite poorly.

I have not dug deeply into what is behind this disconnect. Has anyone else experienced this? I don’t know if it was market conditions or what. I’m thinking I should have done a lot better than I did.

Four years ago I stopped trading my P123 system and got into holding leveraged NDX funds for the long term (QLD and TQQQ). Since then I’ve been doing better than ever. The funds now account for the bulk of my holdings and profits. There is less trading and my tax accounting is much simpler. Unfortunately, since then I’ve been playing catch-up to compensate for my dismal P123 returns and yes, losses. On my income-tax return I’ve got a whopping capital-loss carryforward to show for my efforts. I had to put the brakes on the dismal returns.

Now retired, I’m back to P123 and have developed new screens. The stock picks comprise a very, very small portion of my portfolio and have been doing adequately well the past few months. Previously my entire portfolio consisted of P123 picks but no more. Now they are a tiny fraction.

My question stands: has anyone else felt there is a disconnect between their backtested system returns and their real-world returns going forward, or is mine an isolated case?

Which time period were your real-world returns from?

Yes. And the corollary, what was the time period for the backtest?

Strategies can underperform for many years. A simple “Value” strategy, which has outperformed “Growth” since 1926, underperformed from mid-2007 to mid-2020. A lot of investors will abandon strategies that underperform for 3 years, much less 13 years.

For the sake of discussion let’s say 2010 - 2017, when I switched to LETFs.

I’m sure you’re correct about strategies underperforming for years. That makes an excellent case for indexing and reinforces my beliefs.

To give you some perspective, as I write this, individual stocks comprise about 1.2% of my portfolio. The rest are funds.

There is definitely an argument to be made for indexing. I do believe that active investing can work, and I rely on it myself, but it can take considerable effort and a recognition that, with active investing being a zero-sum game, you must be willing to think and behave differently than the crowd without any guarantee that doing so will deliver the results you seek.

You might check out Yuval’s excellent webinars and various thread posts about how to approach strategy building here on P123.

Hi Chris (PepeLePew),

Your experience was very similar to mine, except that I had been doing it far longer. I started working as a professional in the industry in 1981 for firms such as Goldman, Drexel Burnham, Merrill Lynch, and others before launching a hedge fund (Orion Capital) in the 1990s.

I saw the benefit of quantitative investing from the beginning and was inputting paper-based stock data from ValueLine into an IBM PC clone in the early '80s, doing rudimentary value investing. It was like picking cherries in the early years and worked great for a couple of decades. After selling my portion of the hedge fund and taking a 2-year vacation, I got bored and started a web-based quant site on the 'net. In fact, I think I may have created the first rules-based stock-investing site on the internet with IntelligentValue.com in 1998.

I joined P123 in 2004 when there were few other options, and value investing worked great until about 2007 or so when I saw a distinct decline in results. The only thing I can attribute this to was the growing popularity of data-based, computerized investing, making trades ever more crowded. I stuck with it, but I was forced to get more creative to get decent returns. In other words, I had to go where everybody else wasn’t to find good returns.

I had added ETF strategies to the product mix starting in 2000, but there were very few ETFs to choose from back in the early days. After the explosion of new ETFs in the mid-'00s, I built and offered several more strategies to the public. They worked great to avoid the Financial Crisis by switching at the right time to defensive ETFs. As time went on, I found that the ETF-based strategies were far more reliable and steady than the stock-based strategies, and they were attracting an ever-increasing share of my customers. The ETF market has been growing by 20% to 25% per year since the mid-1990s and is the fastest-growing financial product in history. There are good reasons they are attracting more than a trillion new $ every year (and growing).

I assessed that individual company stocks suffered greatly from idiosyncratic risk, while (of course) ETFs instantly eliminate that risk. You could create a system that picked the best stocks in the world, but invariably something would happen that your rules couldn’t discern, and the stock would blow up. An individual company might file a poor earnings report after the market closed, and by the time the shares opened the next morning, a small company’s price could be down by 50% or 75%.

Other times there would be an out-of-the-blue news story that would sink the shares, such as a class-action lawsuit, an FDA or SEC investigation announced, a key CEO pilfered by a competitor, and a plethora of other reasons. Because of this idiosyncratic risk, the shares might take a terrible percentage decline faster than you could react, and this risk couldn’t be eliminated or reduced with algorithmic investing. By 2017, I could see the writing on the wall and made a big decision to change my website’s name to ETFOptimize.com, eliminating the individual-stock portfolios and strictly offering ETF models.

However, as we all know, life is never without challenges, and the Covid Crash combined with the Fed’s mind-boggling market manipulation totally upended all the macroeconomic and fundamental rules that had been a big part of my ETF strategies’ ranking systems and buy/sell rules. Most of them didn’t work to identify the approaching risk in time, and after the Fed poured $3 trillion into the financial system, they were extremely slow to get back in for the rally (while the country was still in a terrible recession).

The last two years have been a real psychologically turbulent period because after my models scored returns of 90%-100% in 2019, they were rotten in 2020. Just when you think you’ve got it figured out, the market will punch you in the mouth (to paraphrase Mike Tyson).

However, I revised the ETF-based models over the last year, eliminating most of the macroeconomic measures, and developed some highly sophisticated (yet simple) technical indicators to identify periods of increased risk. I figure that technical measures can never go bad because… well… because they are based on price, which is the aspect of investing performance that we are trying to maximize anyway. Price never lies.

By using ETFs with rules-based investing, I find that I have far more accurate and reliable results. Several stocks crashing every day on bad news doesn’t make a difference because you have several hundred other companies in your ETFs that don’t. Plus, I screen the ETF universes to eliminate the ones that are illiquid (hint: daily trading volume doesn’t work for this). The only way an ETF could ever drop 75% is if there were a nuclear war that wiped out 75% of American commerce. That gives you peace of mind that’s hard to find elsewhere. As a result, my ETF models have a record of 100% profitable years - collectively, that’s 92 of 92 years of consecutive profits across my model lineup.

While their average return of 30% isn’t going to set the world on fire, the combination of steady profits of 30% year after year until retirement can make you very wealthy. Moreover, you can sleep at night - which is priceless.

Thank you, ETFOptimize, for a very articulate post. I will definitely have a look at your web site.

Since 2017 I’ve become more of a buy-and-hold man. Taxes played a major role in this decision. Now active in-and-out trading constitutes a tiny sliver of my portfolio, done more for entertainment than anything. Let’s face it, as profitable as indexing is, it is boring :)

I see this all the time. The indices are up but my individual holdings are down.

In another post I describe an idiosyncrasy I found with P123 which may account for the disconnect I described. I did some more testing over the weekend and believe I have a handle on it now. If the backtest results from P123 can’t be depended on 100%, then P123 becomes no better than a dart board or a random-number generator and the backtest results are fool’s gold.

I’ve also been influenced by Warren Buffett who is definitely a buy-and-hold man who doesn’t jump in and out of positions. I have long-term holdings in GOOGL, GS and REGN, not picked by P123, which have done quite well and which I don’t plan to sell. They are what I call “tax locked”, meaning that the gain is such that I would rather hang onto the stock than sell it and pay CG tax on the gain.

Hello Chris,

Just checked out your site. Are you really running out of sample (OS) at 30% for 13 years?


The Art of System Trading is to have a bucket of systems and allocate to the systems based on the market regime and / or their relative strength.

Nothing works great all the time.
There are some secular trends (right now growth due to low interest rates) but they may change too.

You can tame this with momentum and industry momentum and a wide spread of factors in the ranking system, but there are not always (price and factor) trends long enough to be captured.


I think this can be exactly right for some situations. Certainly this has been my experience.

I had a whole bunch of systems when I started at P123 including Piotroski etc.

One of the sims did particularly well. I renamed it 72. That was because it seemed destined to have a 72% annualized return. It seemed focused on that number like a missile locked on its target.

And the port did not disappoint. In retrospect, probably an example of recency bias. But like the sim, it rose like a rocket. Understandably, I moved money into the port (some out of that dog-Piotroski-port). I REGRET that I did not move money into it fast enough. But I did move some money into it. At one point the port bought HEAR before an earnings announcement and I made a quarter-of-a-million dollars in a couple of days. People in the OR asked what I was doing still there (with considerable tongue in cheek). Of course, a quarter-of-a-million dollars does not go as far as it used to.

But then the reverse happened. As quickly as it rose, it began crashing back to earth. Naturally, I began removing money from that port, while some “expert advice” on the forum (to P123 members in general) was to stay-the-course. But I kept removing money as the port reached terminal velocity and crashed to earth. Another REGRET I have is that I did not remove money fast enough.

So I recently made some money on another port that, at the end of the day, is probably another example of recency bias. How much regret I will have remains to be seen.

This talk of minimizing regret on both the up-side and the down-side as well as weighting of expert advice is not really an academic exercise for me. Apparently, it can sometimes be done instinctively. It can be done with some relatively simple algorithms too that are pretty amazing in that they can offer a guarantee on the maximum amount of regret you will have at the end of the day. They can do that out-front before you begin to trade. Those algorithms interest me for some reason.

Whether the algorithms are the way-to-go is not entirely clear. But it does not take a lot of mathematical understanding to know they would have had me buying into my port as it moved up and had me getting out of my port as it moved down. Both saving me and making me a good amount of money.
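For readers who have not seen this kind of algorithm, here is a minimal sketch of one well-known member of the family, usually called Hedge or multiplicative weights. The post above does not say which algorithm is meant, so this choice is an assumption, and the two "systems" and their per-period losses are made-up numbers for illustration only. Losses are assumed to be scaled to the range 0 to 1. Under that assumption, with N systems, T periods, and learning rate eta = sqrt(8 ln N / T), the allocation's cumulative expected loss exceeds that of the best single system by at most sqrt(T ln N / 2), and that bound is known before the first trade.

```python
import math


def hedge(loss_history, eta):
    """Run multiplicative-weights updates over a sequence of rounds.

    loss_history: list of rounds, each a list of per-system losses in [0, 1].
    Returns (final_weights, cumulative_expected_loss_of_the_allocation).
    """
    n = len(loss_history[0])
    weights = [1.0 / n] * n  # start with equal weight
    algo_loss = 0.0
    for losses in loss_history:
        # Loss suffered this round by the current weighted allocation.
        algo_loss += sum(w * l for w, l in zip(weights, losses))
        # Shrink the weight of each system in proportion to how badly it did.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, algo_loss


# Hypothetical data: system 0 is a "rocket" that does well for 10 periods and
# then badly for 10 periods; system 1 is steady throughout.
losses = [[0.1, 0.4]] * 10 + [[0.9, 0.4]] * 10

T, N = len(losses), 2
eta = math.sqrt(8 * math.log(N) / T)
bound = math.sqrt(T * math.log(N) / 2)  # regret guarantee, known up front

mid_weights, _ = hedge(losses[:10], eta)
final_weights, algo_loss = hedge(losses, eta)

best_single = min(sum(r[i] for r in losses) for i in range(N))
regret = algo_loss - best_single

print("weight on rocket after 10 periods:", round(mid_weights[0], 3))
print("weight on rocket after 20 periods:", round(final_weights[0], 3))
print("regret:", round(regret, 3), "guaranteed bound:", round(bound, 3))
```

In this toy run the weight on the rocket system rises while it is doing well and falls once it starts losing, which is the buy-on-the-way-up, sell-on-the-way-down behavior described above. Whether this carries over to real ports is a separate question that the sketch does not answer.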

Whatever the best way to do it is, I apparently agree that equal weight and stay-the-course is not always optimal in all situations. Not that it isn’t best for some situations. Everybody’s models are different.


More great cases for boring indexing.

There is a risk of being in the wrong system at the wrong time. If you misjudge the market you’re screwed as subjectivity enters the equation.


I definitely get that.

You are at about 2% in stock ports? I am at exactly 5% now. Largely for reasons you have already mentioned.

I do think that whatever one does, it doesn’t have to be always equal weight though.


With Yuval’s help I straightened out some issues with my screen.

I had a rule Close(0) >= 5 which is counterproductive because the historical database is split adjusted and the rule messes up the picks. I was thinking I would avoid penny stocks; instead I use the NoOTC universe and the Close(0) >= 5 rule is gone.
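To illustrate why a fixed price floor can misfire on split-adjusted history, here is a small sketch with made-up numbers. The helper `split_adjusted_close` and the prices and split ratios are hypothetical and are not P123 functions or real data. The sketch only assumes what is stated above: historical closes are divided by the ratios of splits that happened later.

```python
def split_adjusted_close(raw_close, later_split_ratios):
    """Divide a historical raw close by every split ratio that came afterwards.

    A 10-for-1 split is ratio 10.0; a 1-for-20 reverse split is ratio 0.05.
    """
    adjusted = raw_close
    for ratio in later_split_ratios:
        adjusted /= ratio
    return adjusted


PRICE_FLOOR = 5.0  # the threshold from the Close(0) >= 5 rule

# Hypothetical stock A: actually traded at $40 on the backtest date,
# then split 10-for-1 some years later.
a_raw = 40.0
a_adjusted = split_adjusted_close(a_raw, [10.0])

# Hypothetical stock B: actually traded at $0.50 on the backtest date,
# then did a 1-for-20 reverse split later.
b_raw = 0.50
b_adjusted = split_adjusted_close(b_raw, [0.05])

# What the rule decides on adjusted data vs. what was true at the time.
a_passes_adjusted = a_adjusted >= PRICE_FLOOR  # excluded in the backtest
a_passes_raw = a_raw >= PRICE_FLOOR            # was a $40 stock back then
b_passes_adjusted = b_adjusted >= PRICE_FLOOR  # included in the backtest
b_passes_raw = b_raw >= PRICE_FLOOR            # was a penny stock back then

print("A:", a_raw, "->", a_adjusted, "passes on adjusted:", a_passes_adjusted)
print("B:", b_raw, "->", b_adjusted, "passes on adjusted:", b_passes_adjusted)
```

In both hypothetical cases the filter's decision depends on splits that had not happened yet on the backtest date, so the backtest picks can differ from what the same rule would have picked in real time.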

I also had a rule EPSActual > 0. This rule was moved to the Universe, making the backtest results more stable. As a result, now I have an even smaller proportion of P123 picks in my portfolio as the screen picks just two stocks.