Persistent Micro-cap Drawdown

Hi All,

Several of the micro- and small-cap strategies I'm working on faced a persistent headwind in mid-2024. Other segments of the market didn't seem to face this challenge. And risk mitigation (breadth) strategies didn't suggest leading deterioration either.

Does the community have any insights on what caused the 2024 micro-cap drawdown? Thank you

Chawasri

I'm pretty sure there was a period of factor inversion. There's another one right now. You can test this by looking at the performance of the Core Combination ranking system over select periods on a universe like the S&P 1500. Right now a large number of the factors that we P123 users rely on are behaving upside-down, with the worst stocks rising the fastest and the best falling on their faces. The Stock Evaluator, a ranking system that I've been sharing with subscribers on Seeking Alpha since around 2018, and which has a very nice out-of-sample track record, looks like this over the last four weeks with weekly rebalancing on the S&P 1500 (all numbers are annualized).
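For anyone who wants to check their own system for inversion locally, here's a minimal Python sketch of the idea: bucket stocks by rank score and compare top-bucket versus bottom-bucket forward returns. A negative spread means the system is inverted. The data here is synthetic for illustration; it is not P123's actual export format.

```python
import numpy as np

def bucket_spread(ranks, fwd_returns, n_buckets=10):
    """Mean forward return of the top bucket minus the bottom bucket.

    A persistently negative spread means the ranking system is
    'inverted': the worst-ranked stocks are beating the best-ranked.
    """
    ranks = np.asarray(ranks, dtype=float)
    fwd_returns = np.asarray(fwd_returns, dtype=float)
    # Assign each stock to a bucket by rank order (0 = worst bucket).
    order = ranks.argsort().argsort()          # dense ordinal ranks
    buckets = order * n_buckets // len(ranks)
    top = fwd_returns[buckets == n_buckets - 1].mean()
    bottom = fwd_returns[buckets == 0].mean()
    return top - bottom

# Toy example: higher rank -> higher return, so the spread is positive.
rng = np.random.default_rng(0)
scores = np.arange(100.0)
rets = scores / 1000 + rng.normal(0, 0.01, 100)
spread = bucket_spread(scores, rets)
```

During an inversion, the same calculation on live data would flip sign.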

4 Likes

Back to memes ... This is the output of a ranking system based only on Short Interest to Float, backtested on the Easy-to-Trade universe with non-missing SI%Float, last 4 weeks only. The upcoming Short Interest Risk Factor should help flag this possibility.

They did well early on and recovered by year end though. I don't know if persistent drawdown is the right word for the population as a whole.

Value did fine in 2024 for microcaps.

Same thing for P123's Buffett's alpha rating

Thank you for replying, Yuval. What made me wonder about this was a post you made several months ago about small/micro caps struggling in 2024. When I saw it happening in my Sims, I thought it would be worth mentioning.

I ran P123's "Small and Micro-Cap Focus" RS on the SP1500 for the last four weeks, and the top bucket shows a return of -68.79% compared to the universe!

I remember something similar occurred in late 2007. Later, it was dubbed the "2007 Quant Quake." While the overall market was performing well, quant desks recorded big crashes. Some of the more leveraged factor-based hedge funds went out of business. I found some ideas for the cause of that anomaly in the Gemini summary here.

While the SP500 has been posting record highs recently, many indicators suggest the footing may be unsteady in the coming days and weeks. For example, the equal-weight SP500 index (RSP) is only now nearing its pre-liberation-day high. Same for Mag-7 stocks. I'm not a chart-watcher by any means, but I saw a technical analyst on CNBC calling for a possible "double top."

Breadth is currently weak for mid-caps, small-caps, and other market sectors. Eight out of the 11 sectors were down today. Defensive assets such as gold (GLD) continue to climb while the market makes new highs, which is quite unusual.

Following the incredible 20% run for the S&P after the April 24 Zweig Breadth Thrust, a pullback seems reasonable, and the tariffs may be enacted (possibly) this weekend. Seems like the stars are aligning for some volatility.

I read about that some time ago (I don't recall the exact paper): when something like that (an inversion) happens in large stocks, the bull market is close to finished.

2 Likes

Yuval -

"I'm pretty sure there was a period of factor inversion."

Do you have any further information or articles about factor inversion? I searched and couldn't come up with much from Google or Gemini. I find this subject rather fascinating, particularly if it's happening right now; we may be able to learn something from it. Moreover, it's another reminder that just when you think you have things figured out, the market always seems to find a way to eliminate any creeping hubris.

On the other hand, maybe it will disappear soon...

Chawasri

Great thread. In 2023 I did some research on what I call "factor cycles".

Factor inversions do in fact happen, and some factors/systems invert more than others.

We can also track how they invert, or cycle, over time.

Here is the spread of annualized performance between the highest and lowest deciles of the P123 Core Value ranking system over time. When the curve is above the x-axis, the highest decile is outperforming the lowest (value > expensive). When it's below the axis (a negative spread), expensive > value.

You can see here just how many times value has "inverted" (expensive outperforming value) since 1999.
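The cycle-tracking idea above can be sketched in a few lines of Python: given a table of per-period bucket returns (the worst-to-best column layout below is an assumption, not P123's actual export format), compute the per-period spread and the fraction of periods in which the factor was inverted.

```python
import numpy as np

def spread_series(bucket_returns):
    """Per-period spread: highest-bucket return minus lowest-bucket return.

    `bucket_returns` is an (n_periods, n_buckets) array with buckets
    ordered worst-to-best (an assumed layout for illustration).
    """
    br = np.asarray(bucket_returns, dtype=float)
    return br[:, -1] - br[:, 0]

def inversion_frequency(bucket_returns):
    """Fraction of periods in which the factor was inverted (spread < 0)."""
    return float((spread_series(bucket_returns) < 0).mean())

# Toy data: the top decile usually wins by 2%, but loses in the first
# 3 of 10 periods.
demo = np.zeros((10, 10))
demo[:, -1] = 0.02
demo[:3, -1] = -0.02
```

Plotting `spread_series` over time reproduces the above-the-axis / below-the-axis picture described in the post.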

Here is a thread on Twitter/X where I shared that research, along with other factors.

https://x.com/RTelford_invest/status/1652353241571549189

And specifically for microcaps:

https://x.com/RTelford_invest/status/1653533092416565249

4 Likes

I wrote about it at length a few years back: Red Bull Investing: Why The Riskiest Stocks Have Been Vastly Outperforming Safe Ones - Portfolio123 Blog

3 Likes

I have a few AI Factor strategies that I'm running live out of sample to evaluate whether I want to put real money in them, and they just made a big rotation into defensives like healthcare and energy. Meanwhile, the low-quality stocks look like they're starting to roll over this week. We'll see how well the robots have timed this.

Rtelford – great idea! Thank you.

Here are my results over the past 5 years (my features are inverted now as well):

For those familiar with Claude/Anthropic Artifacts, I used this one:

:link: https://claude.ai/public/artifacts/1847c7f1-1b63-443a-9155-2551d767e8db

It uses an upload of a standard P123 20-bucket, 5-year rank performance test with a 4-week rebalance download.

You can easily modify the artifact to use your own number of buckets or rebalance period to check whether your features are inverted in your own ranking system. This should work with a free account.
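If you'd rather do the same check without an artifact, here's a rough sketch of the core test: correlate bucket number with bucket return for one row of a rank performance download. A strongly negative correlation flags an inverted feature. The worst-to-best column order is an assumption about the export, so adjust as needed.

```python
import numpy as np

def monotonicity(bucket_returns):
    """Correlation between bucket number and bucket return.

    Values near +1 mean the feature is behaving as designed
    (higher bucket -> higher return); negative values suggest
    the feature is currently inverted.
    """
    br = np.asarray(bucket_returns, dtype=float)
    idx = np.arange(len(br))
    return float(np.corrcoef(idx, br)[0, 1])
```

Running it on each feature's 20-bucket row gives a quick inverted/not-inverted score per feature.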

Thanks again!

BTW: A Runs Test (Wald–Wolfowitz) on my results suggests the inversions are statistically random (p = 0.72), so there's no significant evidence to reject the null hypothesis of randomness of factor inversion (for my ranking system over the last 5 years, using this particular test).
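For anyone who wants to run the same test without extra libraries, here's a minimal sketch of the Wald–Wolfowitz runs test using the standard normal approximation (the input here is a made-up boolean series marking inverted periods, not my actual data):

```python
import math

def runs_test(signs):
    """Two-sided Wald-Wolfowitz runs test on a boolean sequence.

    `signs` marks, e.g., whether each rebalance period was inverted.
    Returns (z, p); a large p is consistent with the inversions being
    randomly scattered rather than clustered.
    """
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need both outcomes present")
    # Count runs: maximal stretches of identical values.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    # Two-sided p-value from the normal approximation.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p
```

A perfectly alternating series produces far more runs than chance (z well above 0, tiny p), while a heavily clustered one produces far fewer (z well below 0).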

-Jim

1 Like

TL;DR: Do you want to know if your signal is statistically significant, and how much variation there is in it? (See poll at bottom.)

Longer version:

Recently, I’ve spoken to Marco on ideas to strengthen the platform. From my perspective, two critical pieces are missing:

  • A Risk Model
  • Portfolio Construction Module/Techniques

To test this, I built some tools and want to see if others here want it.

A motivating example:
Say you’ve built a ranking system with a nice-looking performance chart. The top bucket beats the bottom bucket, it monotonic, so it looks like you’ve got a clean signal. Let's look at the p123 composite (a good model):

But what happens when you actually trade it?

  • Sometimes the signal flips: the bottom bucket outperforms the top.
  • Sometimes that “inversion” lasts weeks or months.
  • Sometimes the recovery is quick, other times it drags on.
  • And last but perhaps most importantly: if you only hold 20 stocks, the real spread can be far messier than the smooth chart you see in research. Most ICs are < 0.05 (i.e., individual stock predictions are weak, so you need to make many independent bets to realize that backtest, assuming it holds out of sample).

This feature would automatically show you:

  • How statistically significant your signal really is (vs. just noise)
  • The exact periods when your signal inverted, how deep it went, and how long recovery took
  • Top–bottom bucket spread, statistically adjusted across all buckets
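As a rough illustration of the significance piece, here's a minimal Python sketch: compute a per-period rank IC (Spearman-style, as a Pearson correlation of rank orders) and a t-statistic on its mean. The shapes, names, and the |t| > 2 rule of thumb are simplifications for illustration, not a spec for the proposed feature.

```python
import math
import numpy as np

def rank_ic(scores, fwd_returns):
    """Spearman-style IC: Pearson correlation of the two rank orders."""
    r1 = np.asarray(scores).argsort().argsort().astype(float)
    r2 = np.asarray(fwd_returns).argsort().argsort().astype(float)
    return float(np.corrcoef(r1, r2)[0, 1])

def ic_t_stat(ics):
    """t-statistic of the mean IC across rebalance periods.

    |t| > ~2 is the usual rough bar for a signal that is more
    than noise.
    """
    ics = np.asarray(ics, dtype=float)
    n = len(ics)
    return float(ics.mean() / (ics.std(ddof=1) / math.sqrt(n)))
```

Even a tiny average IC can be highly significant if it is consistent across many periods, which is exactly the "many independent bets" point above.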

Why it matters: almost all signals are noisy. Current charts can give a false sense of security, like the future will look as clean as the backtest. By surfacing the messy periods, these tools help in two ways:

  1. You can improve your signals by studying when and why they break.
  2. You can manage expectations better during inevitable drawdowns.

Factor inversions, drawdowns, etc. have plagued the platform, imo. Here is some output using the community composite ranking system and the traditional bucket performance. We'd explain the stats for users in plain language.

This is a first cut but the heavy lifting is done. A thumbs up or down if you'd like this would be helpful.

Notice how often 10 buckets invert. These would be times you'd be lagging the market or losing money.
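A simple way to surface those lagging stretches is to scan the spread series for runs of negative periods. This sketch (with made-up inputs, not actual P123 output) records each inversion episode's start, length, and cumulative depth:

```python
def inversion_episodes(spread):
    """List (start, length, depth) for each stretch of negative spread.

    `spread` is a per-period top-minus-bottom bucket return series;
    depth is the cumulative spread over the episode (roughly how much
    a long-top / short-bottom book would have lost).
    """
    episodes, start, total = [], None, 0.0
    for i, s in enumerate(spread):
        if s < 0:
            if start is None:
                start, total = i, 0.0
            total += s
        elif start is not None:
            episodes.append((start, i - start, total))
            start = None
    if start is not None:                     # episode runs to the end
        episodes.append((start, len(spread) - start, total))
    return episodes
```

Sorting the episodes by depth or length immediately gives the "how deep it went, and how long recovery took" view.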

Here's some of the inversion people were talking about this year:

Here are some stats using 100 buckets (notice how low IR is).

Here's a sample historical output:

  • Helpful
  • Not Helpful
0 voters
4 Likes

You can already obtain weekly statistics that look at performance at a very granular level. Not sure users would pay more for this feature, but it would be a nice-to-have.

1 Like

I have noticed that some factors (technical factors) do not invert.
You could analyze all your current single factors and keep only those that do not invert (or invert the least).
But that raises the question: would you rather outperform the market the most in the long run, with inversion periods? Or consistently beat the market for a lower total return?
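If you wanted to screen factors that way, a quick sketch would be to compute, for each factor's per-period spread series, the fraction of negative periods and keep the steadiest ones (factor names and spread series below are made up):

```python
import numpy as np

def least_inverting(factor_spreads, keep=3):
    """Sort factors by how often their top-bottom spread goes negative.

    `factor_spreads` maps factor name -> per-period spread series.
    Returns the `keep` factors with the fewest inversion periods.
    Note the trade-off: the steadiest factors are not necessarily
    the ones with the highest long-run spread.
    """
    freq = {name: float((np.asarray(s) < 0).mean())
            for name, s in factor_spreads.items()}
    return sorted(freq, key=freq.get)[:keep]
```

This only measures inversion frequency, so it directly embodies the "consistency vs. total return" question raised above.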

TL;DR: Even highly advanced users like Korr could benefit from more customization options within P123—without needing a formal feature request for every idea. That said, I’ve mostly resigned myself to using downloads, unless P123 sees open, customizable tools as a way to grow and market the platform.

People can now download the rank performance test, upload it into Claude and run the artifact to get a graphic of feature inversions.

This is just an example. A member could just as easily ask Claude to create an artifact that gives a t-score. Too many steps (with the downloads), I know.

But the reason I mention it here is artifacts run in the browser. P123 could off-load some of the computation. P123 could give us a place to share artifacts and maybe allow them access to some of the data without the download and upload steps.

Also, I like all of Korr's ideas, but I would probably do things slightly differently. I wouldn't mind taking his ideas as an artifact and customizing them.

I really think P123 needs to give users more flexibility that doesn’t require a formal feature request for every creative idea we have. Almost no feature requests get implemented—and understandably so. It would be impossible to fulfill them all.

Maybe not with artifacts per se, but something like them: open-source code that runs in the browser, that members can contribute to, and that Claude (or another LLM) can modify. In other words, members (and Claude) could do some of the heavy lifting on feature requests.

A bit of a repeat but this was a popular feature that P123 will never implement as a feature: Python program to find correlations and multicollinearity

Algoman's feature could not be done with artifacts or anything as simple. But we could start with artifacts (or something like it) and work on some open-source Python code (including Algoman's popular code) later.

Briefly, we had code that ran in Colab with direct access to P123's data: a sandboxed, safe environment where code could be modified. AWS could be used too.

This feature is really just like the API, however. But for new members not yet ready to use an API, there are things we could do to attract them to P123.

Imagine a beginner-friendly, low-code path to API-style workflows—and a community space to share, modify and improve them. That’s the idea.

BTW, Claude Artifacts could easily handle tasks like randomizing weights for the optimizer. That sort of thing is trivial—and it opens the door to automating and customizing the spreadsheet workflows many of us already use.

Take this example: Pitmaster described a great grid-based weight optimizer (presumably in Python):

This is exactly the kind of task that could be turned into an Artifact.

I don’t currently use the optimizer myself, so I’m not sure what the final upload step would look like—but generating weights on a grid and saving them to CSV (or having something to copy and paste into the optimizer) is well within Claude’s reach.

So I asked Claude to start building it using this prompt:

“Could you modify this idea to run in an Artifact? A user might have 50 weights and want to randomize or change them, then plug those weights into a ranking system or optimizer. For now, that might require a CSV file that is uploaded into the optimizer.”

Here’s the first draft of that Artifact :

:link: https://claude.ai/public/artifacts/25b0a615-c04c-4614-b222-fc467ecdf30e

You (or Claude) can easily modify it to do exactly what you want. And I doubt this will ever be implemented as a formal feature request—but we can have it now using downloads and a bit of copy-pasting via Artifacts.
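For reference, the core of that weight-randomizer is only a few lines of Python. The column layout and upload format below are assumptions; adjust to whatever the optimizer actually expects.

```python
import csv
import io
import random

def random_weight_grid(factor_names, n_sets=10, seed=0):
    """Generate `n_sets` random weight vectors that each sum to ~100.

    Weights are drawn uniformly, normalized to 100, and rounded to
    two decimals (so each row's sum is 100 up to rounding error).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_sets):
        raw = [rng.random() for _ in factor_names]
        total = sum(raw)
        rows.append([round(100 * w / total, 2) for w in raw])
    return rows

def to_csv(factor_names, rows):
    """Serialize the grid as CSV text, one header row plus one row per set."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(factor_names)
    writer.writerows(rows)
    return buf.getvalue()
```

The CSV text can then be saved or copy-pasted into whatever the optimizer's upload step turns out to be.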

Maybe there’s even a way to skip the download/copy/paste step without needing P123 to write backend code.

Let us handle the ideas and coding—just give us a place to plug things in.

—

If P123 supported even basic Artifact-style tools in the browser—and optionally allowed them to tie into ranking or optimizer inputs—this kind of thing could evolve into a flexible, user-powered toolkit, all without requiring backend development time.

1 Like

@SZ The idea wasn’t about creating something extra to pay for. My intent was simply to explore whether this could be useful for other users. I haven't seen a pricing model for P123 features, so it's hard to say in advance what something is worth.

I noticed a number of posts from members asking what’s happening in the markets, as well as @yuvaltaylor’s explanations of factor inversions. I thought a tool that helps users better understand their strategy, whether it’s performing well, going through a drawdown, or lagging in total returns, could add clarity and preparation.

@Jrinne I took a look at artifacts again. I don't use Anthropic. How does this work with external data? Most of the stuff I've been doing involves pulling through the API and storing locally. Do artifacts do that? Or is this more or less like github in the sense we're collaborating with code using Anthropic as the interface?

Thnx for the feedback.

Thanks for looking into this for everyone