I also enjoy content from @judgetrade.
While the analysis of tree mechanics is very insightful, it is naturally quite general. I decided to ground it more empirically within the P123 environment to analyze the economic rationale behind the trained model. I hope this helps less experienced users better grasp how tree-based strategies work.
I locally trained a Random Forest (RF) model using only factors from the Small and Micro Cap Focus ranking on a small-cap universe. The training period was 2012-2024.
I used a very simple tree algorithm (10 trees with a max depth of 2):
model = RandomForestRegressor(n_estimators=10, max_depth=2, random_state=8, min_samples_leaf=100)
Then, I asked Gemini to explain the model.
AI - GEMINI ANALYSIS
Model Breakdown: The "Sentiment-Led Growth" Strategy
This model represents an aggressive Fundamental Momentum strategy. It creates a hierarchy of decision-making that prioritizes Analyst Sentiment and Operational Growth above all else, while utilizing Industry Momentum strictly as a risk-management "kill switch."
Notably, this is not a Value strategy. Despite the availability of value factors (like FCF Yield or EV/EBITDA) in the dictionary, the decision trees ignore them entirely. The model does not care how cheap a stock is; it only cares if the fundamental picture is improving and if the industry is stable.
1. The "Big Picture" & Equity Styles
The strategy splits the investment universe into two primary buckets based on the root node of the decision trees. It asks one of two questions immediately: "Is Wall Street getting more bullish?" or "Is the company growing operating income?"
Based on the frequency and structural position of the factors, the style weights are:
- Analyst Sentiment (45%): The primary driver. The model seeks stocks with upward estimate revisions, earnings surprises, and falling short interest.
- Growth (40%): The secondary driver. The model requires both Historical growth (Operating Income) to enter the trade and Forecasted growth (Future EPS) to confirm the trade.
- Sector Momentum (10%): Used exclusively as a risk filter. The model refuses to hold stocks in crashing industries, regardless of the company's individual quality.
- Liquidity (5%): A minor filter ensuring the stock has sufficient trading volume.
- Value (0%): Valuation metrics are completely excluded.
2. How the Decision Logic Works
The 10 decision trees operate on a distinct three-step flow:
Step A: The Gatekeepers (The Root Node)
Every single decision tree begins by checking one of two "Gatekeeper" factors. If a stock fails this test, it is almost immediately relegated to a "Sell" or "Neutral" bucket.
- 60% of Trees check $20 (Core Sentiment). They generally require a Rank > 74-75.
- 40% of Trees check $16 (Operating Income Growth). They generally require a Rank > 36-41.
Step B: The Closer (The Buy Signal)
If a stock passes the Gatekeeper (i.e., it has High Sentiment or High Growth), the model looks for confirmation. Eight of the ten trees use $2 (Forecasted EPS Growth) for this; r7 and r8 use $26 instead.
- Even if Sentiment is high, if the Forecasted Growth ($2) is weak (Rank <= 36-41), the model assigns a roughly neutral score (0 or -1).
- If Forecasted Growth is also high, the model assigns the maximum scores (+8 or +9).
Step C: The Kill Switch (Risk Management)
If a stock fails the Gatekeeper test (Low Sentiment or Low Growth), the model checks $12 (Industry RSI) in nine of the ten trees; r5 checks $16 instead.
- This is a "crash detector." If the Industry RSI rank is very low (e.g., <= 8 or <= 35), the model assigns massive penalties (Scores of -5 to -13). This prevents "catching falling knives" in collapsing sectors.
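The three steps can be read directly off a single tree. The sketch below hand-translates rule r3 from the TRAINED TREES table into plain Python. The function name `score_r3` and the sample ranks are made up for illustration, and it assumes that `Eval(condition, a, b)` returns `a` when the condition is true.

```python
def score_r3(ranks):
    """Hand translation of r3: Eval($20<=75,Eval($12<=71,-3,3),Eval($2<=36,-1,8)).

    `ranks` maps a factor id such as "$20" to its rank (0-100).
    """
    if ranks["$20"] <= 75:
        # Gatekeeper failed: fall through to the Industry RSI check.
        return -3 if ranks["$12"] <= 71 else 3
    # Gatekeeper passed: forecasted EPS growth confirms or vetoes.
    return -1 if ranks["$2"] <= 36 else 8


# Hypothetical stocks, not taken from the post.
strong = {"$20": 90, "$12": 50, "$2": 80}
weak_forecast = {"$20": 90, "$12": 50, "$2": 20}
weak_industry = {"$20": 40, "$12": 30, "$2": 80}

print(score_r3(strong), score_r3(weak_forecast), score_r3(weak_industry))
# prints: 8 -1 -3
```

Each stock follows exactly one path through the tree, so only two of the three factors are ever consulted for any given stock.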
3. Feature Importance Statistics
The table below highlights which factors drive the actual buy/sell decisions. The "Times in Starting Node" column is crucial: it identifies which factors dictate the strategy's primary direction versus which factors act as filters.
| Rank | Factor ID | Factor Name | Role in Model | Total Usage | Times in Starting Node |
|---|---|---|---|---|---|
| 1 | $12 | Industry RSI (Momentum) | The "Kill Switch" | 9 of 10 | 0 |
| 2 | $2 | Forecasted EPS Growth | The "Buy" Confirmation | 8 of 10 | 0 |
| 3 | $20 | Core Sentiment | Primary Filter (Root) | 6 of 10 | 6 |
| 4 | $16 | Op. Income Growth (PYQ) | Secondary Filter (Root) | 5 of 10 | 4 |
| 5 | $26 | Median Daily Total | Liquidity Filter | 2 of 10 | 0 |
Note: All other factors provided in the dictionary are unused in the active model.
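The usage counts in this table can be checked against the rule strings in the TRAINED TREES table. The following is a minimal sketch that only does string matching on those rules; the names `RULES`, `factor_ids`, `total_usage` and `root_usage` are my own, and it assumes the first factor appearing in each rule is the root split.

```python
import re
from collections import Counter

# Rule strings copied from the TRAINED TREES table.
RULES = {
    "r1": "Eval($16<=41,Eval($12<=8,-13,-2),Eval($2<=41,-3,4))",
    "r2": "Eval($16<=37,Eval($12<=35,-7,-1),Eval($2<=40,-4,5))",
    "r3": "Eval($20<=75,Eval($12<=71,-3,3),Eval($2<=36,-1,8))",
    "r4": "Eval($20<=75,Eval($12<=73,-3,4),Eval($2<=41,-0,8))",
    "r5": "Eval($20<=74,Eval($16<=36,-5,1),Eval($2<=36,-0,8))",
    "r6": "Eval($20<=74,Eval($12<=72,-3,4),Eval($2<=37,-0,8))",
    "r7": "Eval($16<=36,Eval($12<=35,-7,-1),Eval($26<=70,1,8))",
    "r8": "Eval($16<=41,Eval($12<=8,-12,-2),Eval($26<=71,1,8))",
    "r9": "Eval($20<=75,Eval($12<=75,-3,4),Eval($2<=37,-1,9))",
    "r10": "Eval($20<=74,Eval($12<=35,-5,1),Eval($2<=40,-1,8))",
}


def factor_ids(rule):
    """Return the factor ids in order of appearance; the first is the root."""
    return re.findall(r"\$\d+", rule)


total_usage = Counter()  # number of trees in which a factor appears
root_usage = Counter()   # number of trees in which a factor is the root split
for rule in RULES.values():
    ids = factor_ids(rule)
    root_usage[ids[0]] += 1
    total_usage.update(set(ids))  # count each factor once per tree

for factor, n_trees in total_usage.most_common():
    print(factor, n_trees, root_usage[factor])
```

The printed counts match the "Total Usage" and "Times in Starting Node" columns: $12 appears in 9 trees and never at the root, while $20 and $16 are the only root factors (6 and 4 trees).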
4. Factor Definitions
- $20 (Core Sentiment): A composite factor defined as:
  - 45% Estimate Revisions: Changes in Current Year and Current Quarter EPS estimates, plus variability.
  - 25% Surprises: Earnings surprises in the last two quarters.
  - 15% Recommendations: Change in analyst Buy/Sell ratings.
  - 15% Short Interest: Short interest as a % of shares outstanding.
- $16 (Op. Income Growth): OpIncGr%PYQ. Growth of Operating Income vs. the same quarter last year.
- $2 (Forecasted Growth): (CurQEPSMean - HistQ4EPSActual)/Abs(HistQ4EPSActual). A measure of "Forecasted Momentum": it compares the consensus estimate for the current quarter against the actual result from a year ago.
- $12 (Industry Momentum): Aggregate("RSI(200)",#industry). The long-term relative strength of the industry group. Used to identify sector crashes.
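To make the $2 formula concrete, here is a small worked example in Python. The function name and the EPS figures are hypothetical, and returning `None` for a zero year-ago EPS is only a placeholder, since the post does not say how P123 handles that case.

```python
def forecasted_growth(cur_q_eps_mean, hist_q4_eps_actual):
    """(CurQEPSMean - HistQ4EPSActual) / Abs(HistQ4EPSActual)."""
    if hist_q4_eps_actual == 0:
        return None  # placeholder for an undefined ratio (assumption)
    return (cur_q_eps_mean - hist_q4_eps_actual) / abs(hist_q4_eps_actual)


# Hypothetical EPS figures, chosen only to show the sign behaviour.
print(forecasted_growth(0.75, 0.50))    # profit expected to grow   -> 0.5
print(forecasted_growth(-0.25, -0.50))  # loss expected to narrow   -> 0.5
print(forecasted_growth(-0.75, -0.50))  # loss expected to widen    -> -0.5
```

The `Abs()` in the denominator is what keeps the sign meaningful when the year-ago EPS was negative: a narrowing loss scores as positive growth instead of negative.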
TRAINED TREES:
| tree | rules |
|---|---|
| r1 | Eval($16<=41,Eval($12<=8,-13,-2),Eval($2<=41,-3,4)) |
| r2 | Eval($16<=37,Eval($12<=35,-7,-1),Eval($2<=40,-4,5)) |
| r3 | Eval($20<=75,Eval($12<=71,-3,3),Eval($2<=36,-1,8)) |
| r4 | Eval($20<=75,Eval($12<=73,-3,4),Eval($2<=41,-0,8)) |
| r5 | Eval($20<=74,Eval($16<=36,-5,1),Eval($2<=36,-0,8)) |
| r6 | Eval($20<=74,Eval($12<=72,-3,4),Eval($2<=37,-0,8)) |
| r7 | Eval($16<=36,Eval($12<=35,-7,-1),Eval($26<=70,1,8)) |
| r8 | Eval($16<=41,Eval($12<=8,-12,-2),Eval($26<=71,1,8)) |
| r9 | Eval($20<=75,Eval($12<=75,-3,4),Eval($2<=37,-1,9)) |
| r10 | Eval($20<=74,Eval($12<=35,-5,1),Eval($2<=40,-1,8)) |
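For anyone who wants to produce rule strings of this shape from their own model, the sketch below renders a tree as a nested `Eval()` expression. It is not necessarily how the rules above were exported. It works on plain lists laid out like scikit-learn's `tree_.children_left`, `tree_.children_right`, `tree_.feature`, `tree_.threshold` and `tree_.value` arrays (with `value` flattened to one number per node), so it runs without scikit-learn installed. The function name `tree_to_eval` and the hand-built arrays, which reproduce r1, are illustrative assumptions; any rescaling of leaf values into the integer scores shown in the table is not covered.

```python
LEAF = -1  # scikit-learn marks a missing child with -1


def tree_to_eval(children_left, children_right, feature, threshold, value,
                 factor_names, node=0):
    """Recursively render a regression tree as a nested Eval() string.

    Samples with feature <= threshold go to the left child, which maps
    onto the "true" branch of Eval(condition, true_value, false_value).
    """
    if children_left[node] == LEAF:
        return f"{value[node]:g}"
    args = (children_left, children_right, feature, threshold, value,
            factor_names)
    cond = f"{factor_names[feature[node]]}<={threshold[node]:g}"
    left = tree_to_eval(*args, node=children_left[node])
    right = tree_to_eval(*args, node=children_right[node])
    return f"Eval({cond},{left},{right})"


# Hand-built arrays reproducing r1 (nodes in pre-order, leaves use -2 fillers).
factor_names = [f"${i}" for i in range(1, 28)]  # column 0 is $1, column 15 is $16
children_left = [1, 2, LEAF, LEAF, 5, LEAF, LEAF]
children_right = [4, 3, LEAF, LEAF, 6, LEAF, LEAF]
feature = [15, 11, -2, -2, 1, -2, -2]
threshold = [41.0, 8.0, -2.0, -2.0, 41.0, -2.0, -2.0]
value = [0.0, 0.0, -13.0, -2.0, 0.0, -3.0, 4.0]

rendered = tree_to_eval(children_left, children_right, feature, threshold,
                        value, factor_names)
print(rendered)
# prints: Eval($16<=41,Eval($12<=8,-13,-2),Eval($2<=41,-3,4))
```

With a fitted forest, the same function could be called once per entry in `model.estimators_`, passing each estimator's `tree_` arrays after flattening `tree_.value`.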
FACTORS USED:
| factor_name | factor |
|---|---|
| $1 | FRank(LoopSum("EPSExclXor(Ctr,Qtr) > EPSExclXor(Ctr+4,Qtr)",6,0), #all, #desc, #NANeutral) |
| $2 | FRank((CurQEPSMean- HistQ4EPSActual)/Abs(HistQ4EPSActual), #sector, #desc, #NANeutral) |
| $3 | FRank(EPSExclXorGr%PYQ, #sector, #desc, #NANeutral) |
| $4 | FRank(FCF%AssetsQ, #all, #desc, #NANeutral) |
| $5 | FRank(FCFGr%TTM, #sector, #desc, #NANeutral) |
| $6 | FRank((FCFQ-FCFPYQ)/MktCap, #sector, #desc, #NANeutral) |
| $7 | FRank(CurFYEPSMean / Price, #industry, #desc, #NANeutral) |
| $8 | FRank(ConsEstMean(#EBITDANTM,0)/EV, #industry, #desc, #NANeutral) |
| $9 | FRank(CurFYSalesMean/MktCap, #industry, #desc, #NANeutral) |
| $10 | FRank(Abs(GrossPlantA/SalesA-FMedian("GrossPlantA/SalesA",#industry)), #all, #asc, #NANeutral) |
| $11 | FRank(GrossProfit%AssetsA, #industry, #desc, #NANeutral) |
| $12 | FRank(Aggregate("RSI(200)",#industry), #all, #desc, #NANeutral) |
| $13 | FRank((InventoryA-InventoryPY)/AstTotPY, #all, #asc, #NANeutral) |
| $14 | FRank(LoopMedian("ROE%(Ctr,TTM)",12), #all, #desc, #NANeutral) |
| $15 | FRank(MktCap, #all, #asc, #NANeutral) |
| $16 | FRank(OpIncGr%PYQ, #sector, #desc, #NANeutral) |
| $17 | FRank(Abs (PEGST - 1), #all, #asc, #NANeutral) |
| $18 | FRank(ROA%Q, #all, #desc, #NANeutral) |
| $19 | FRank(SalesGr%PYQ, #all, #desc, #NANeutral) |
| $20 | FRank($R_Core_Sentiment, #all, #desc, #NANeutral) |
| $21 | FRank(MedianVol(252)/SharesCur(126), #all, #asc, #NANeutral) |
| $22 | FRank(MedianVol(65)/SharesCur(0), #all, #asc, #NANeutral) |
| $23 | FRank(Aggregate("TotalReturn",#subindustry), #all, #desc, #NANeutral) |
| $24 | FRank((OperCashFlTTM - CapExTTM + (1-TaxRate%TTMInd/100)*IntExpTTM)/EV, #all, #desc, #NANeutral) |
| $25 | FRank((SMA(150,21)-SMA(150,252))/ATRN(150), #all, #desc, #NANeutral) |
| $26 | FRank(MedianDailyTot(126), #all, #asc, #NANeutral) |
| $27 | FRank(AvgVol(13)/AvgVol(13,30), #all, #desc, #NANeutral) |