Using the settings in RS to get performance similar to the simulator

There is sometimes a large discrepancy between the results I get from a ranking system in the RS performance test and the results I get when I use it afterwards in the simulator.

I need some hints on how to get these two tests closer in performance (I know it’s impossible to make them identical).

Here is what I have done.

In my simulator, I have:

A 25-stock portfolio

Average holding period is 101 days, and turnover is 300%

Universe is USA/Canada

Volume is median(91) > 100,000, and Price > 1

A 10-year test (using the other 10 years as an out-of-sample test period)

In my rank performance, I use:

Same universe

The same volume and price rules apply (but here I put them in the universe)

8-week rebalance

Buckets: 70 (so each bucket has about 70 stocks; the reason is to compensate for the fact that I can’t set variable slippage and hi+low average price)

10-year test

Minimum price: 1

The main problem is choosing the rebalancing period so as to get roughly equivalent testing between Rank Performance and the simulator. I have set it to 8 weeks, but I am not sure if this is the best choice.

Any idea how to get these tests more alike?

What are your sell rules in the simulator? I can’t see making these two things match.

For example, consider a 1-stock portfolio. You monitor it weekly in a strategy, but it only triggers on average once per year. Let’s say June. If you put this in the ranking system and set reconstitution to 1 year, it may trigger every January, which is 6 months off.

Also, in your simulator do you have a ‘force in universe’ rule? This will create a difference as well.

The only way I can see getting them to be the same is if you have a sell rule of ‘1’. Then use dynamic weighting with no min or max position limit. If you use 100 buckets in the RS, set your buy rule to Rank >= 99 in the simulator. Scale accordingly to positions, but you also cannot set an arbitrary 25 positions and have it match, because the universe is not static and the number of positions is always changing.
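To make that bucket-to-buy-rule mapping concrete, here is a small Python sketch (the 2,500-stock universe size is a made-up number for illustration, not from this thread):

```python
def rank_cutoff(n_buckets: int) -> float:
    """Simulator buy rule (Rank >= cutoff) equivalent to holding the
    top bucket of an n-bucket rank-performance test."""
    return 100 * (1 - 1 / n_buckets)

for n in (10, 20, 100):
    print(f"{n} buckets -> buy rule: Rank >= {rank_cutoff(n):g}")

# With a hypothetical 2,500-stock universe, the top of 100 buckets holds
# about 25 names -- but that count drifts as the universe grows or shrinks,
# which is why a fixed 25-position sim can never match it exactly.
print(2500 / 100, "stocks in the top bucket, on average")
```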

Make sure you have the same start date and same re-balance frequency.

In the end, you are simply making your simulator behave more like the RS back-tester, and not the other way around.

What I like to do is start with the ranking-system backtest and then work with the simulator to add more intelligent buy/sell rules and lower turnover. It’s hard to work from the simulator to the RS backtest, in my view.

Do you have all your buy rules in your universe? Or at least most of them?

Why would you want them to be identical?

The RS performance is very tied to a starting date if you’re doing an 8-week rebalance. If you run it on different starting dates a week or two apart you’ll see different results. The RS performance also doesn’t take into account slippage or sell rules. It’s a very different beast. You’ll also see very different factors getting very different results whether you use 1-week or 13-week rebalancing. For example, mean reversion performs wonderfully on 1-week rebalancing and not so well on 13-week rebalancing. The bucket performance is meant to illustrate the entire range of results for your universe. It’s a research and learning tool. The simulation, on the other hand, is tailored to your specific portfolio management system, with the RS just a small part of it.

No, I only have the price limit and volume rule in the universe. And in my simulator I don’t use any sell rules besides Rank < 98. Why?

Yes, I agree. One of the reasons I want settings in the RS that are more like the settings in the simulator is that the really good systems out there seem to focus on getting good factor nodes and good weightings in the RS. And one of the ways to achieve this is to mass-test the RS nodes in combination and with different weightings. The simplest way to do this, for those who can’t use Python or other advanced automation tools, is to use DataMiner and the RS optimisation. But both can only do this with the RS, not the simulator.

So having settings in the RS that replicate, as closely as possible, what your simulation would do is the simplest way to mass-test the nodes in combination and find a good weight for each node. And after doing this, do some stress testing to find out how robust the system is.

That’s the reason I’m looking for the best settings in the RS that make it comparable to the simulator.

Yes, it’s true what you say about the 1- vs. 13-week rebalance. What would you use to try to replicate a simulation that has a turnover of 300% each year? (I know it can’t be a full replication between the RS and the simulator, so here I’m just looking for the best setting.)

But wouldn’t a weekly rebalancing create a massive turnover that you don’t have in your simulator? I have my settings in the simulation set to weekly, but my sell rule of Rank < 98 keeps the turnover in the simulator at 300%.

But with a 1-week rebalance in the RS, it would pick the very best-scoring stocks each week. Unlike your simulator, which allows the stocks in your portfolio to fall in score from, say, 100 to 98, the RS performance test only holds 100-scoring stocks.
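To illustrate that mechanical difference, here is a toy Python sketch (entirely made-up random-walk scores, not P123 data) comparing the turnover of always holding the top bucket against a buy-at-the-top, sell-below-the-98th-percentile rule:

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_weeks, n_hold = 2500, 52, 25

# Random-walk scores as a stand-in for weekly ranking-system ranks.
scores = rng.normal(size=(n_weeks, n_stocks)).cumsum(axis=0)

def annual_turnover(portfolios):
    """Average fraction of the portfolio replaced per week, annualized."""
    weekly = [len(new - old) / n_hold
              for old, new in zip(portfolios, portfolios[1:])]
    return np.mean(weekly) * 52

# Strategy A: always hold the 25 best-scoring stocks (like a top-bucket test).
top_bucket = [set(np.argsort(week)[-n_hold:]) for week in scores]

# Strategy B: sell only when a holding falls below the 98th percentile,
# then refill from the top (like a "Rank < 98" sell rule in the simulator).
cutoff = int(n_stocks * 0.98)
held = set(np.argsort(scores[0])[-n_hold:])
hysteresis = [set(held)]
for week in scores[1:]:
    order = np.argsort(week)                           # ascending by score
    position = {stock: i for i, stock in enumerate(order)}
    held = {s for s in held if position[s] >= cutoff}  # apply the sell rule
    for s in order[::-1]:                              # refill from the top
        if len(held) == n_hold:
            break
        held.add(s)
    hysteresis.append(set(held))

print(f"top-bucket turnover: {annual_turnover(top_bucket):.0%}")
print(f"hysteresis turnover: {annual_turnover(hysteresis):.0%}")
```

The hysteresis rule should churn noticeably less, which is exactly the gap between the two tests being discussed here.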

Yes, I agree, and that is not what I want. The reason I’m trying to get the RS settings more like the simulator is the mass-testing possibilities in the RS Optimizer and DataMiner: node performance, node combinations, and node weights.

I think if you have as many buy rules in your universe as possible, the RS performance will more closely follow your sim. But in my opinion the best reason to look at the performance tab is to make sure your RS has a nice linear slope to it.

But what kind of buy rules can you have in the universe, aside from price and volume, so that you get much the same results in the RS and the simulator?

And what rebalance period would you use in the RS when the simulator has 300% turnover?

Yes, I agree with the “slope” argument.

So just a simple question: you are implying this is linear (“make sure your RS has a nice linear slope to it”)?!

Maybe you want to call that linear with one absolutely HUMONGOUS outlier at the end? Maybe……

Does anyone else see an outlier at the end there? Maybe it is just me. Just ignore a literally 100-sigma event (without exaggeration) and call it good enough!!!

Do you work for the government or something, to ignore a 10-sigma event and call it a day? Nothing to see here. Keep your benefits (after the destruction of a 10,000-year flood, to use a government analogy)? :thinking: :rofl:

I choose to at least consider non-linear methods that would not call that an outlier but rather a feature to be exploited. That bears repeating: A FEATURE TO BE EXPLOITED FOR MAKING MONEY IN THE MARKET!

And with the data I have now, a random forest does do better than linear models that use the slope, with 5-fold cross-validation of the method, learning from what might be a truly non-linear relationship after all. This suggests that maybe it can be exploited to make a better model.
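For reference, the shape of that comparison in scikit-learn looks something like this (the factor and return arrays below are random placeholders, so the printed scores mean nothing; only the 5-fold cross-validation harness is the point):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(5000, 10))  # placeholder factor ranks
y = rng.normal(size=5000)                 # placeholder forward returns

models = {
    "linear": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200,
                                           min_samples_leaf=50,
                                           random_state=0, n_jobs=-1),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean 5-fold CV R^2 = {scores.mean():.4f}")
```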

It did make it easier for P123 to just ignore that in the past. I get the impression that Marco is embracing techniques that could potentially exploit this “anomaly”, by which I mean ML/AI that is not always linear. Marco SAYS he wants to do that. A simple fact, however: a convenient method to rebalance a non-linear model and exploit this anomaly using DataMiner downloads is not available now.

I do think Dan understands the use case and is now working on this. Tip of the hat to Dan and P123 for recognizing the potential here. :pray: :pray: :pray:

It does not cost much, if anything, to make the data available as a download with DataMiner. And it would increase revenue through increased DataMiner and API usage. I am not saying Dan does not recognize this already.

Jim

So is alpha hiding in the residuals?

I took a simplified version of a long-running ranking system through rank performance with slippage = 0.50%. Attached are plots for three rebalance periods: 1wk, 4wk, 13wk. The final panel is from a sim optimizer run.

Eyeballing (pun intended, Jim) the results, it appears that the residuals for the highest bar are 16%, 10%, and 6%, respectively.

I found the 13wk performance so surprising for a micro/small-cap model that I had to run a sim. Sure enough, the equity curve is respectable.

Interesting stuff.

:rofl: :rofl: :rofl:
Oh, I SEE what you are saying. Seriously, I am ALWAYS using visual analogies. What is with that?

Short serious example. I was chatting with ChatGPT and said there is obviously a place for the concept of RESOLUTION in what I am doing (a visual analogy).

Resolution is like turning up the microscope magnification so high, in medical school (or in surgery, for that matter), that everything just becomes grainy and less clear. You do get more magnification, but at the cost of resolution, and you can only stand so much of it if you want to make out anything. Analogous, I think, to trying to use a model to look at the effect of factors with too much magnification.

Specifically, it is not that cool to set the minimum leaf size of a random forest to 1, for whatever reason(s). You can see the veins and cells on one leaf at this magnification, but you miss the entire (random) forest by doing that. It is the same thing said a different way in common parlance: you cannot see the forest for the leaf you are peering at. So, really just common sense, I think.
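A minimal sketch of that point, assuming scikit-learn and placeholder data (a weak linear signal plus noise, so the numbers are illustrative only): sweeping min_samples_leaf typically shows the cross-validated score suffering at a leaf size of 1, where the forest memorizes noise.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(5000, 10))    # placeholder factor ranks
y = 0.01 * X[:, 0] + rng.normal(size=5000)  # weak signal plus noise

# Turning the "magnification" down: larger leaves average over more stocks.
for leaf in (1, 10, 50, 200):
    rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=leaf,
                               random_state=0, n_jobs=-1)
    score = cross_val_score(rf, X, y, cv=5).mean()
    print(f"min_samples_leaf={leaf}: mean 5-fold CV R^2 = {score:.4f}")
```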

BTW, ChatGPT said I had a great “innovative and novel” idea, and that maybe I should publish it. But (for some) it is just common sense. So it is either completely obvious to some, or something others will never get no matter how hard you try.

So it is just another AI hallucination, I think, that this could ever be published. Hallucinations are visual, right??? Help me!!! I cannot stop.

Thanks. I get the joke and it is a good one with a lot of truth in it. :slight_smile:

Jim

> So having settings in the RS that replicate, as closely as possible, what your simulation would do is the simplest way to mass-test the nodes in combination and find a good weight for each node. And after doing this, do some stress testing to find out how robust the system is.
>
> That’s the reason I’m looking for the best settings in the RS that make it comparable to the simulator.
>
> Yes, it’s true what you say about the 1- vs. 13-week rebalance. What would you use to try to replicate a simulation that has a turnover of 300% each year? (I know it can’t be a full replication between the RS and the simulator, so here I’m just looking for the best setting.)

Ah. I think a much better alternative to simulations is to use the DataMiner RollingScreen. That’s what I use for a lot of my optimization. It’s terrific because you can specify a long holding period and have it test that every single week on different universes. It’ll do hundreds of iterations on dozens of ranking systems really quickly. Setting “Include Results: yes” will give you a ton of valuable information. For a 300% turnover you’d want your holding period to be 365/3, or 122 days.
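As a back-of-the-envelope check, here is that arithmetic as a tiny Python sketch (`holding_days` is a hypothetical helper, not part of DataMiner; the result is the value you would use for the RollingScreen holding period):

```python
def holding_days(annual_turnover_pct: float) -> float:
    """Average holding period in days implied by a given annual turnover."""
    return 365 / (annual_turnover_pct / 100)

print(holding_days(300))      # ~121.7 days, i.e. the 122 days noted above
print(holding_days(300) / 7)  # ~17.4 weeks, for comparing rebalance periods
```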


I agree with this 100%. DataMiner and P123 are wonderful for a lot of different methods: Yuval’s, mine, and I will leave it to others to comment on their particular methods.

Not everyone will use RollingScreen all of the time for everything, I would think, and I assume no one expects everyone to use it exclusively.

Personally, I am very pleased that I can now get overnight updates on factors through ScreenRun with DataMiner (without a 500-row limit), allowing rebalances of some algorithms that cannot be put back into P123’s ranks during the week.

Any augmentation to DataMiner over time would be very much appreciated. I hope P123 is open to some ideas that would be used by many (possibly by machine learners in some instances). I think Dan is looking into a suggestion by Jonpaul in this regard.

Nice! DataMiner has a lot of great uses and I appreciate Yuval pointing one of them out for us.