Which ranking systems do you use?

I’m not sure I quite understand where you’re coming from, but here are some observations/comments based on the general theme of your questions:

  • I test on the biggest universe that meets my required liquidity, not a subset of stocks (I exclude financials and real estate). I’ve tried subsets with certain characteristics as test universes, but I’ve found it’s usually better to just test on larger universes and let the ranking system sort things out.
  • Probably test variations beyond 20 top holdings. I used to have a detailed and specific testing process, but now normally simplify and equal weight the top 5% of a ranking system. When I refine things to fewer holdings performance gets better, but I feel the top 5% gives me decent signal on quality of ranking system without worrying too much about idiosyncratic stuff going on. Maybe I used to think I could find the “perfect” way to express a factor and combine factors and weights, but over time as I’ve seen the market change and factors wax and wane I’m thinking I was trying to apply too much precision to something that’s more messy.
  • I like to test from 2007 forward to capture the financial crisis - but that is getting to be a little bit long in the tooth. The problem is since 2009 the US market has been in an almost persistent secular bull market with lower/lowering rates and only very brief periods of crash/rapid recovery. I worry about that, but not sure how to handle other than worry about it. I feel like going back to 2000-2003 bear mkt is too far back and factors worked very differently back then (many were much more effective then than they are now).
  • Ranking systems are the core of my stock selection process, but I do have a few screen rules to set minimum requirements. Ideally I would incorporate all the screen rules into my ranking system; however, I’ve had a few cases where I’ve been unable to make that work and use screen rules rather than contort the ranking system. In general, for the best backtests I think it’s usually best to put as much as I can into the ranking system. (As to specific screen rule variances - if I want stocks to have positive revenue growth, some maximum debt load, or positive earnings, I may have to hard-code that into a screen rule because the ranking system will let some through no matter what I do. In my experience doing this almost never improves system performance - but since I find it hard in “real life” to buy certain types of companies, I’ve found this approach helps me better stay in the game. But to be clear - the tradeoff is that the backtest performance almost always decreases when I do this. Also be careful of curve fitting at this stage. I do this because I have Buffett’s voice in my head reminding me that the market doesn’t have to open tomorrow, so there may be a tail-risk scenario where I have to hold things for a long time.)
  • My ranking systems are kitchen sink approaches. I’m pretty sure I use something identical to P123 Sentiment as one component of some of my ranking systems, but it’s just one factor of many.
  • Beating the market is hard - as John Bogle points out as a truism, after backing out fees the average investor’s return must be below the market’s average return. But 25% annualized returns are extraordinary and would make a legendary career if sustained over time. Something about that universe seems suspect, IMHO.
  • My ranking systems changed a lot my first 2-3 yrs here as I tried all sorts of stuff, all sorts of factors, and learned all sorts of lessons, but have stabilized over the past year or so. I still try out new ideas, but my key learning is for behavioral reasons I needed to use a system that gelled with my personality and concerns even if it didn’t have the best backtest. Obviously still learning and taking things in, with open mind toward what I see - but strong respect for observation that quant approaches can lead me to companies that don’t (currently) make sense to me at both an intellectual and visceral level, even if the backtest is great.
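To make the "equal weight the top 5%" test from the list above concrete, here is a minimal sketch in Python. It is not P123 code: the tickers, rank scores and returns are made up, and it assumes a higher rank score is better.

```python
def top_fraction_equal_weight(ranks, returns, fraction=0.05):
    """Equal-weighted return of the highest-ranked `fraction` of a universe.

    ranks:   dict ticker -> rank score (assumption: higher is better)
    returns: dict ticker -> period return as a decimal
    """
    ordered = sorted(ranks, key=ranks.get, reverse=True)
    # Always hold at least one stock, even in a tiny universe.
    n = max(1, round(len(ordered) * fraction))
    picks = ordered[:n]
    return picks, sum(returns[t] for t in picks) / n


# Hypothetical 40-stock universe: S0 is the worst ranked, S39 the best.
ranks = {f"S{i}": i for i in range(40)}
returns = {f"S{i}": i / 1000 for i in range(40)}

picks, avg_return = top_fraction_equal_weight(ranks, returns)
print(picks, avg_return)
```

With 40 stocks the top 5% is only two names, which is why a large universe matters for this kind of test.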

Hope some of this gets at your question. I’m no pro - and make plenty of mistakes - but wanted to share this in case it’s helpful.

It could be that I have constructed the universe incorrectly, or that there is survivorship bias. It is composed of the current founder-led companies listed at https://custom.firstrepublic.com/2020-annual-report/about

| Ticker | Name | Added | Removed | Days Held | Return |
|---|---|---|---|---|---|
| FLWS | 1-800-FLOWERS.COM Inc | 10/24/2021 | — | 6 | 10.45% |
| DDD | 3D Systems Corp | 10/24/2021 | — | 6 | 5.15% |
| NSP | Insperity Inc. | 10/24/2021 | — | 6 | 1.67% |
| ADBE | Adobe Inc | 10/24/2021 | — | 6 | 1.05% |
| AFL | AFLAC Inc | 10/24/2021 | — | 6 | -5.08% |
| AKAM | Akamai Technologies Inc | 10/24/2021 | — | 6 | -2.29% |
| ARE | Alexandria Real Estate Equities Inc. | 10/24/2021 | — | 6 | -1.97% |
| AMZN | Amazon.com Inc | 10/24/2021 | — | 6 | 1.11% |
| PLD | ProLogis Inc | 10/24/2021 | — | 6 | 0.67% |
| HES | Hess Corp | 10/24/2021 | — | 6 | -6.85% |
| AKR | Acadia Realty Trust | 10/24/2021 | — | 6 | -5.15% |
| ADC | Agree Realty Corp | 10/24/2021 | — | 6 | 1.62% |
| AFG | American Financial Group Inc | 10/24/2021 | — | 6 | -2.35% |
| ADI | Analog Devices Inc | 10/24/2021 | — | 6 | -2.76% |
| AJG | Arthur J. Gallagher & Co. | 10/24/2021 | — | 6 | 1.51% |
| BRK.B | Berkshire Hathaway Inc | 10/24/2021 | — | 6 | -0.77% |
| CAKE | Cheesecake Factory Inc. (The) | 10/24/2021 | — | 6 | -4.04% |
| CIEN | Ciena Corp | 10/24/2021 | — | 6 | 3.08% |
| CTAS | Cintas Corp | 10/24/2021 | — | 6 | 1.47% |
| CLH | Clean Harbors Inc | 10/24/2021 | — | 6 | -1.95% |
| CGNX | Cognex Corp | 10/24/2021 | — | 6 | 3.14% |
| CPRT | Copart Inc | 10/24/2021 | — | 6 | 1.98% |
| CRVL | CorVel Corp. | 10/24/2021 | — | 6 | 4.84% |
| CSGP | CoStar Group Inc | 10/24/2021 | — | 6 | -13.33% |
| CCI | Crown Castle International Corp | 10/24/2021 | — | 6 | 0.66% |
| DHI | D.R. Horton Inc. | 10/24/2021 | — | 6 | 0.77% |
| DHR | Danaher Corp | 10/24/2021 | — | 6 | -0.51% |
| DISH | DISH Network Corp | 10/24/2021 | — | 6 | -5.19% |
| EPR | EPR Properties | 10/24/2021 | — | 6 | -1.99% |
| EQR | Equity Residential | 10/24/2021 | — | 6 | 1.16% |
| ESS | Essex Property Trust Inc. | 10/24/2021 | — | 6 | 1.36% |
| EL | Estee Lauder Cos Inc (The) | 10/24/2021 | — | 6 | -1.01% |
| EEFT | Euronet Worldwide Inc | 10/24/2021 | — | 6 | -8.49% |
| EXEL | Exelixis Inc | 10/24/2021 | — | 6 | 0.51% |
| FDX | FedEx Corp. | 10/24/2021 | — | 6 | 1.09% |
| FR | First Industrial Realty Trust Inc. | 10/24/2021 | — | 6 | -0.12% |
| GRMN | Garmin Ltd | 10/24/2021 | — | 6 | -13.05% |
| MNST | Monster Beverage Corp | 10/24/2021 | — | 6 | 0.32% |
| HSTM | HealthStream Inc | 10/24/2021 | — | 6 | -5.97% |
| HOLX | Hologic Inc | 10/24/2021 | — | 6 | -0.41% |
| NSIT | Insight Enterprises Inc | 10/24/2021 | — | 6 | 0.47% |
| IPAR | Inter Parfums Inc | 10/24/2021 | — | 6 | 20.27% |
| INTU | Intuit Inc. | 10/24/2021 | — | 6 | 5.96% |
| IONS | Ionis Pharmaceuticals Inc | 10/24/2021 | — | 6 | 4.70% |
| JJSF | J & J Snack Foods Corp | 10/24/2021 | — | 6 | -0.57% |
| JNPR | Juniper Networks Inc | 10/24/2021 | — | 6 | 4.87% |
| KIM | Kimco Realty Corp | 10/24/2021 | — | 6 | -1.57% |
| LEN | Lennar Corp | 10/24/2021 | — | 6 | -0.57% |
| LAD | Lithia Motors Inc. | 10/24/2021 | — | 6 | -5.66% |
| LPSN | LivePerson Inc | 10/24/2021 | — | 6 | -5.75% |
| MDC | M.D.C. Holdings Inc. | 10/24/2021 | — | 6 | -1.63% |
| MAC | Macerich Co (The) | 10/24/2021 | — | 6 | -1.79% |
| MANH | Manhattan Associates Inc | 10/24/2021 | — | 6 | 7.65% |
| MAR | Marriott International Inc | 10/24/2021 | — | 6 | 3.41% |
| MTZ | MasTec Inc. | 10/24/2021 | — | 6 | 0.67% |
| MCY | Mercury General Corp | 10/24/2021 | — | 6 | -3.78% |
| MMSI | Merit Medical Systems Inc | 10/24/2021 | — | 6 | -6.97% |
| MTH | Meritage Homes Corp | 10/24/2021 | — | 6 | 4.42% |
| MSFT | Microsoft Corp | 10/24/2021 | — | 6 | 7.26% |
| MSTR | MicroStrategy Inc | 10/24/2021 | — | 6 | -0.48% |
| MCRI | Monarch Casino & Resort Inc | 10/24/2021 | — | 6 | 7.39% |
| MNR | Monmouth Real Estate Investment Corp | 10/24/2021 | — | 6 | -1.36% |
| FIZZ | National Beverage Corp | 10/24/2021 | — | 6 | -0.91% |
| NHI | National Health Investors Inc. | 10/24/2021 | — | 6 | -4.20% |
| NTCT | NetScout Systems Inc | 10/24/2021 | — | 6 | 0.45% |
| NBIX | Neurocrine Biosciences Inc | 10/24/2021 | — | 6 | 1.81% |
| NUS | Nu Skin Enterprises Inc. | 10/24/2021 | — | 6 | -2.24% |
| NVDA | NVIDIA Corporation | 10/24/2021 | — | 6 | 12.50% |
| ORCL | Oracle Corp | 10/24/2021 | — | 6 | -2.35% |
| OSIS | OSI Systems Inc | 10/24/2021 | — | 6 | -2.04% |
| PAYX | Paychex Inc. | 10/24/2021 | — | 6 | 0.14% |
| CNXN | PC Connection Inc | 10/24/2021 | — | 6 | 5.86% |
| PEGA | Pegasystems Inc | 10/24/2021 | — | 6 | -7.69% |
| PENN | Penn National Gaming Inc | 10/24/2021 | — | 6 | -4.06% |
| PLAB | Photronics Inc | | | | |

You can’t use the current holdings of a fund for backtesting; you can only use them going forward.
For backtesting you have to use snapshots of historic point-in-time holdings, which are not easy to obtain.

The periodic snapshots of stock holdings with their dates are then transferred into a P123 Stock Factor file which can be used for backtesting.

Thanks Chris319. I agree that my universe seems to give very good average returns. I’m testing out slightly different compositions of the universe.

Regardless, is there some reason why you would use Between (RankPos, 1, 60) instead of

Rank < 60
RankPos > 60 ?

Thank you, SpacemanJones, for your detailed and informative response. It is always beneficial to have input from someone with more experience.

I’ve only been a member for a little under 6 months, so it’s been a steep learning curve. I appreciate all of the suggestions and read anything I come across in order to avoid the worst mistakes. :slight_smile:

Because you could have, say, Between (RankPos, 10, 20), thus eliminating the top n stocks.
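A small Python sketch of what such a band does, in case it helps. This is not P123 code; it assumes RankPos is 1 for the best-ranked stock and that both bounds are inclusive, and the tickers are placeholders.

```python
def between_rankpos(ranked_tickers, lo, hi):
    """Keep tickers whose 1-based rank position lies in [lo, hi].

    ranked_tickers: list sorted best-first (assumption: position 1 = best).
    """
    return [t for pos, t in enumerate(ranked_tickers, start=1) if lo <= pos <= hi]


# Hypothetical ranked list T1 (best) ... T25.
ranked = [f"T{i}" for i in range(1, 26)]

band = between_rankpos(ranked, 10, 20)    # skips the top 9 names
top60 = between_rankpos(ranked, 1, 60)    # upper bound beyond the list is harmless
print(band)
```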

Agreed. OP hasn’t explained how this universe is constructed.

I use my own ranking systems and spend some time now and then to look for improvements. The last year I’ve been updating my live systems every 2-3 months. The changes are usually quite small and often don’t lead to selling any of my open positions.

I suspect some of the reasons the public systems are not very good are:

  1. It’s really hard to build a system for large caps.
  2. Sharing smallcap systems is a bit dangerous: if many people with deep pockets decide to implement the system you’ve shared, you end up fighting over the same stocks at the same time.

The good news is that most of the components you need to build a good system have been shared (in the forum, blog, literature), you “just” have to assemble it yourself.

I’m still a beginner, but if I had to summarize what seems to have worked for me (most of it learned from the forum and the blog), it’d be something like this:

Be careful using buy/sell-rules, keep the list as short as possible. Instead focus on the ranking system. I wouldn’t worry about overfitting - instead, worry about fitting in a robust manner. That is, make sure that you use as much data as possible when you optimize your ranking system. Use rolling screens with 100 stocks, and then use rankpos>something and do more rolling tests to dig deeper into the universe.

Make subuniverses using mod(stockid,N)=0,1,2,…, and test your ranking in each one using the rank performance module & simulations.
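Outside P123, the same split can be sketched in a few lines of Python. The stock IDs below are made up; the point is only that every stock lands in exactly one of the N buckets, so the subuniverses never overlap.

```python
def split_by_mod(stock_ids, n):
    """Partition integer stock IDs into n disjoint buckets by ID modulo n."""
    buckets = {r: [] for r in range(n)}
    for sid in stock_ids:
        buckets[sid % n].append(sid)
    return buckets


# Hypothetical stock IDs 1..10 split into 3 subuniverses.
subuniverses = split_by_mod(range(1, 11), 3)
for remainder, ids in subuniverses.items():
    print(remainder, ids)
```

A ranking system that only works in one of the buckets is probably fitted to a handful of names.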

Make a random subset of the universe using “random>.5” (or some other limit) in your rolling screener. Perform multiple runs as an extra check for robustness.
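Here is a rough Python analogue of the random-subset check, only to show the idea. The `seed` argument is my addition so that a run can be repeated; the tickers are placeholders.

```python
import random


def random_subset(universe, threshold=0.5, seed=None):
    """Keep each stock when a uniform draw in [0, 1) exceeds `threshold`."""
    rng = random.Random(seed)
    return [t for t in universe if rng.random() > threshold]


# Hypothetical universe of 1000 tickers, sampled five times.
universe = [f"S{i}" for i in range(1000)]
runs = [random_subset(universe, 0.5, seed=s) for s in range(5)]
for subset in runs:
    print(len(subset))
```

Each run keeps roughly half the universe; if the ranking only looks good in some of the runs, that is a warning sign.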

If you have access to simulations: use the “Add from previous runs” button on the “restrict buy list” to rerun your simulations with all the stocks bought in the first run removed. Do this several times, removing layer after layer of stocks, and see if your ranking is able to “sort” the remaining stocks in a good way.
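The layer-peeling idea can be sketched in Python as below. It is a simplification: it works on a single ranking snapshot with made-up scores, whereas the simulation removes every stock bought at any point during the previous run.

```python
def peel_layers(ranks, layer_size, n_layers):
    """Repeatedly take the top `layer_size` names and remove them from the pool.

    ranks: dict ticker -> rank score (assumption: higher is better).
    Returns a list of layers, best layer first.
    """
    remaining = dict(ranks)
    layers = []
    for _ in range(n_layers):
        if not remaining:
            break
        top = sorted(remaining, key=remaining.get, reverse=True)[:layer_size]
        layers.append(top)
        for ticker in top:
            del remaining[ticker]
    return layers


# Hypothetical rank scores.
ranks = {"A": 90, "B": 80, "C": 70, "D": 60, "E": 50}
layers = peel_layers(ranks, layer_size=2, n_layers=3)
print(layers)
```

If the ranking has real signal, the average return should tend to fall from one layer to the next.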

Do not test/build a system only considering the top 20-30 stocks, this will lead to overfitting.

Also, be aware that while you have a decent chance of beating the market following this path (IMHO), you will not get live results as good as your backtests indicate.

How so?

Like Georg Vrba (geov) mentions above, you probably have survivorship bias in the set. Most of those seem to be companies that were successful during the test period and were selected after the fact. For example, I think GoPro was a founder-led company, but it isn’t on the list. How many failed or lower-performing founder-led companies are missing from the list because they were not the successful ones during the test period? And why is the very successful founder-led Ubiquiti (UI) not on the list? Now, if that list of companies had been selected in 2010, prior to your test period (as Georg Vrba (geov) mentions above), then it’s a different conversation.

This model uses the Ranking System Core: Sentiment (This is the standard P123 ranking system.)

I created a dynamic custom universe like this one with stock factors:
EVAL($Sentiment|$CARP>1.8, $$ARKKnew | $$FIFNX, $$VDIGX)

Where $Sentiment and $CARP are formulas for the Consumer Sentiment and the Cyclically Adjusted Risk Premium market timers, resp. https://imarketsignals.com/2021/consumer-sentiment-and-the-cyclically-adjusted-risk-premium-work-together-as-a-profitable-stock-market-timer/

$$ARKKnew, $$FIFNX , and $$VDIGX are point in time stock factor series which I periodically update. So, when Risk-Off is indicated the universe becomes that of conservative mutual fund VDIGX. The model has a fairly high turnover because it sells all the $$ARKKnew and $$FIFNX stocks when condition Risk-Off is signaled.
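For readers who don’t use stock factors, the switching logic can be sketched in Python roughly like this. It rests on my reading of the formula: the condition is taken as "$Sentiment is on, OR $CARP is above 1.8", and the three stock factors are modelled as plain sets. The tickers are placeholders, not actual fund holdings.

```python
def in_universe(ticker, sentiment_on, carp, arkk, fifnx, vdigx, carp_threshold=1.8):
    """Mimic EVAL(condition, risk_on_universe, risk_off_universe) for one ticker.

    Assumption: the condition reads as `sentiment_on OR carp > carp_threshold`.
    """
    risk_on = sentiment_on or carp > carp_threshold
    if risk_on:
        return ticker in arkk or ticker in fifnx
    return ticker in vdigx


# Placeholder membership sets (hypothetical tickers).
arkk, fifnx, vdigx = {"AAA", "BBB"}, {"CCC"}, {"DDD"}

print(in_universe("AAA", True, 1.0, arkk, fifnx, vdigx))   # risk-on via sentiment
print(in_universe("DDD", False, 1.0, arkk, fifnx, vdigx))  # risk-off universe
```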

This model has a high annualized return and should continue to do well if Cathie Wood (ARKK) gets her mojo back.

10-stock model stats from 2016:
Number of Positions 10

Period 01/01/16 - 10/31/21

Benchmark S&P 500 (SPY)

Quick Stats as of 10/31/2021
Total Return 516.20%
Benchmark Return 150.53%
Active Return 365.67%

Annualized Return 36.59%
Max Drawdown -27.88%
Benchmark Max Drawdown -33.72%
Overall Winners (390/646) 60.37%

Annual Turnover 1,048.58%
Sharpe Ratio 1.73
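As a sanity check, the quick stats above are consistent with each other. The Python below recomputes the annualized return, active return and win rate from the reported figures; the 365.25-day year is my assumption about the day-count convention, so the last decimal may differ from P123’s.

```python
from datetime import date


def annualized_return(total_return_pct, start, end, days_per_year=365.25):
    """Compound annual growth rate (in %) implied by a total return over a period."""
    years = (end - start).days / days_per_year
    growth = 1 + total_return_pct / 100
    return (growth ** (1 / years) - 1) * 100


start, end = date(2016, 1, 1), date(2021, 10, 31)

annualized = annualized_return(516.20, start, end)  # reported: 36.59%
active = 516.20 - 150.53                            # reported: 365.67%
win_rate = 390 / 646 * 100                          # reported: 60.37%

print(f"{annualized:.2f}% {active:.2f}% {win_rate:.2f}%")
```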

Well, with the disclaimer that I’m not even remotely an expert, this is just my not-very-confident opinion: I think you might easily end up with too many fitting parameters vs the amount of data you work with.

The performance of a single stock is mostly unpredictable. If you fit a ranking system with tons of nodes to a very small set of stocks, your final backtest will look great, full of stocks with extremely good returns. The problem is that you might have fitted your system to noise.

The more stocks you use during fitting, the more able you are to cut through the noise and fit your ranking system to the “predictable” part of the future return.

To take an extreme example:
My model fitting routine the first few months as a user here at P123 was based only on simulations. The model had 20 (or 10? I can’t remember) positions, a growing list of buy/sell-rules, and an expanding ranking system. I would change something, rerun the model, see if the alpha improved, change something else, rerun, repeat. After a while my alpha was 20ish % over 10 years and I was more than a little happy :slight_smile:

Every single simulation used to fit the model started on the same date: 1/1 on the first year I had access to. Then one day I accidentally reran the model from 2/1 on the first year. The 10 year simulated alpha dropped from 20ish % to -4 %. Yikes! Fortunately I did not have much real money invested, but this was a bit of a wake up call.

Of course, this is a dumb example; had I known about rank performance or rolling screens, I probably would have realized earlier that something was wrong. But this is what fitting to noise looks like.

I find that No OTC is a good place to start.

It’s not that complicated.

I just wanted to say that this is one of the best and sanest approaches to backtesting that I’ve encountered on this forum. Even I learned from it. Thank you. (This is just my personal opinion, not P123’s.)

Thank you, test_user, for providing such a thorough, well-thought-out and practical response. Comments like this help.

In my backtesting I often use 20 stocks, but I can test with 30-40 or 50 stocks. I’ve done the rolling backtest, but not nearly enough.

But I’m pretty sure that I have the issue that several have mentioned above. I only have founder-led companies that are active today, so there will be a substantial survivorship bias :(

I will continue to work on the strategy, but I welcome all the feedback.

Thank you Yuval! And thank you for being so generous in sharing ideas and advice on the blog and forum - a large part of my workflow is based on your insights. I’ve seen a clear improvement in live results after following your lead.

Whycliffes, thanks! I hope it wasn’t too much, it might seem overwhelming when you’re starting up. Simulations/rolling screener/rank performance complement each other nicely, used together they are a powerful tool.

The founder-led focus sounds very interesting, but if you end up with a small universe, it might be hard to properly test and build a model (in addition to survivorship bias). I’ve spent some time trying to build a ranking system for utilities, but there are just too few utility companies for me to be confident in the model I end up with.

Good luck with the model building!

I know at one point P123 had come up with functionality of an inlist of sorts, where you could have stocks and the dates they were in the list and dates they were out.

I forgot what it is called now. Is there a tutorial for how to use such functionality? E.g. AAPL from 1/1/01 till 6/504 and IBM from 2/103 to 6/3/17 and so on. The current inlist is just a list active as of the simulation date.

Go to Research / Imported Stock Factors. Give your factor a name. Then upload a CSV file with the following format: in the date column put the starting and ending dates; in the value column put 1 for in the list and 0 for not in the list; in the ticker column put the ticker. Make sure to use the correct date format.

Let’s say your imported stock factor is called inlistxx. Then add to your universe rules simply $$inlistxx.

That should do the trick. If it doesn’t, please let me know.
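If you keep your membership history as a list of (ticker, start, end) intervals, a short Python script can turn it into rows of the kind described above. The column names, column order and ISO date format here are assumptions on my part, so check them against what the upload page actually expects. The tickers and dates are placeholders.

```python
import csv
import io
from datetime import date


def membership_rows(intervals):
    """Turn (ticker, start, end) intervals into dated 1/0 membership rows.

    A row with value 1 is written on the start date and a row with value 0
    on the end date; end=None means the stock is still in the list.
    """
    rows = []
    for ticker, start, end in intervals:
        rows.append((start.isoformat(), ticker, 1))
        if end is not None:
            rows.append((end.isoformat(), ticker, 0))
    return sorted(rows)


def to_csv(rows):
    """Serialize rows to CSV text (header names are an assumption)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "ticker", "value"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical membership history.
intervals = [
    ("AAA", date(2015, 1, 1), date(2018, 6, 30)),
    ("BBB", date(2016, 3, 1), None),
]
rows = membership_rows(intervals)
print(to_csv(rows))
```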

Here is something I found:

No ranking
No rules
No slippage
No carrying costs

If I set universe = benchmark, all of the stocks in the benchmark should be picked and the annual returns should match, correct? In other words, the screen should pick all of the stocks in the NASDAQ 100 and it should match the index.

Here is what I get with universe and benchmark both = NASDAQ 100

Screen: 9.19%

Benchmark: 10.24%

Why the difference? Does it have something to do with dividends?

There could be various reasons for the difference:

  1. Weighting. Screens rebalance to equal weight at the end of every period. The NASDAQ 100 is cap-weighted.
  2. Reconstitution schedule. The NASDAQ 100 gets reconstituted every December; it’s possible that your screen’s reconstitution schedule is slightly different.
  3. Missing or extra stocks. Because of data vagaries, sometimes we have slightly fewer or more than 100 stocks in the NASDAQ 100.
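The weighting effect in point 1 is easy to see with a toy example. The Python below uses two made-up stocks; the caps and returns are purely illustrative.

```python
def equal_weight_return(returns):
    """Average return when every stock gets the same weight."""
    return sum(returns.values()) / len(returns)


def cap_weight_return(returns, caps):
    """Average return when each stock is weighted by its market cap."""
    total_cap = sum(caps.values())
    return sum(returns[t] * caps[t] / total_cap for t in returns)


# Hypothetical two-stock index: one giant that rallied, one small flat stock.
returns = {"BIG": 0.20, "SMALL": 0.00}
caps = {"BIG": 900.0, "SMALL": 100.0}

ew = equal_weight_return(returns)      # what an equal-weight screen sees
cw = cap_weight_return(returns, caps)  # what a cap-weighted index sees
print(ew, cw)
```

When the largest names outperform, a cap-weighted index beats the equal-weighted screen of the same stocks, and the reverse holds when small names lead.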

Thanks Yuval! I will try that out over the weekend.