I think that to get the same ranking performance and simulation results you would need to match the settings: the same fixed slippage, rebalance period, universe, and sell criteria. That means the simulation sell rule has to match the portfolio size: 25 stocks = sell RankPos > 25. But if you use variable slippage, you will see a lot of performance lost to the higher turnover that such a tight sell rule creates. Instead, I look at relative performance between ranking systems, using more realistic slippage and sell rules.
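To illustrate the turnover point, here is a minimal Python sketch of how a sell rule tied to the portfolio size compares with a looser one. It is a toy model, not Portfolio123 code: the function name `simulate_turnover`, the tickers, and the rank data are all made up for illustration, and it assumes the portfolio is refilled with the best-ranked stocks after each sale.

```python
def simulate_turnover(weekly_ranks, size, sell_rank):
    """Average fraction of the portfolio sold per rebalance.

    weekly_ranks: list of dicts mapping ticker -> rank position (1 = best).
    size:         number of positions to hold.
    sell_rank:    sell a holding when its rank position is > sell_rank.
    """
    holdings = set()
    sells = 0
    for ranks in weekly_ranks:
        # Sell rule: drop anything whose rank position exceeds the threshold.
        sold = {t for t in holdings if ranks.get(t, float("inf")) > sell_rank}
        holdings -= sold
        sells += len(sold)
        # Buy rule (assumption): refill with the best-ranked stocks.
        for ticker in sorted(ranks, key=ranks.get):
            if len(holdings) >= size:
                break
            holdings.add(ticker)
    rebalances = len(weekly_ranks) - 1  # the first week only builds the portfolio
    return sells / (size * rebalances) if rebalances > 0 else 0.0


# Hypothetical three-week rank history for four tickers.
weeks = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 1, "C": 2, "B": 3, "D": 4},
    {"A": 1, "B": 2, "C": 3, "D": 4},
]
tight = simulate_turnover(weeks, size=2, sell_rank=2)  # sell RankPos > size
loose = simulate_turnover(weeks, size=2, sell_rank=4)  # sell RankPos > 2X size
print(tight, loose)
```

In this toy data the tight rule swaps B and C back and forth every week, while the 2X rule never trades. With variable slippage, each of those extra trades costs performance.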
My settings, ranking performance vs. simulation:
- Buckets/portfolio size: 20 buckets and 20 stocks. This keeps it simple, and since I am only comparing relative performance, not absolute performance, it's OK that the 20 buckets hold more stocks than the simulated strategy.
- Universe: similar rules to yours, but I split it using evenID into training and validation sub-universes so I can use the same time period for both.
- Slippage: 0.2% slippage on the ranking test and variable slippage on the strategy. This can contribute to the difference if there is a lot of trading, but with $100k average daily trading I would not expect it to be significant.
- Sell criteria: the ranking performance test effectively sells at the portfolio “size”. I set my simulation to 2X the portfolio size, so RankPos > 40. 2X may not be ideal as it forces the turnover up, but it keeps the simulation more comparable to the ranking results.
- Rebalance/period: same rebalance frequency and test period for both: weekly and 10 years. This seems like the best way to be consistent between the two.
- Criteria for “better”: I am looking for consistently high returns, so high alpha and low negative standard deviation.
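The universe split and the “negative standard deviation” criterion from the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are made up, the split is done on the parity of a numeric stock ID (which half serves as training is up to you), and “negative standard deviation” is read here as downside deviation around zero, which may not be the exact statistic the platform reports.

```python
import math


def split_by_id_parity(stock_ids):
    """Split a universe into two halves by even/odd numeric ID (assumed scheme)."""
    even = [s for s in stock_ids if s % 2 == 0]
    odd = [s for s in stock_ids if s % 2 != 0]
    return even, odd


def downside_deviation(returns):
    """Root mean square of the negative returns, counting gains as zero.

    Assumption: this is one common reading of "negative standard deviation".
    """
    if not returns:
        return 0.0
    squared = [min(r, 0.0) ** 2 for r in returns]
    return math.sqrt(sum(squared) / len(returns))


# Hypothetical IDs and weekly returns, for illustration only.
even_ids, odd_ids = split_by_id_parity([101, 102, 103, 104, 105])
risk = downside_deviation([0.02, -0.01, 0.03, -0.03])
print(even_ids, odd_ids, risk)
```

Because both halves cover the same dates, any difference between them comes from the stocks rather than from the market period.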
The interesting thing is that, for the two ranking systems I am comparing, the training sub-universe shows consistent relative returns between the ranking performance test and the simulation. However, the validation sub-universe and the total universe show opposite results: the system that does better on ranking performance does worse in the simulation. Even stranger, the system that ranked higher on ranking performance and lower in the simulation has the lower turnover in the simulation, and turnover is part of what I expected to drive the difference.