Boosting your returns

Ole Petter,

Thank you for your interest.

So the simple answer to this is yes, it can be done pretty easily, but it could be better. P123 could add A LOT along the lines of what Steve is working on with Marco regarding DataMiner. So just a hat tip to what Steve and Marco are working on.

But for now the best way to get data is through the sims, I think. You can get about 20,000 buy transactions at a time through a sim. This dwarfs anything DataMiner can do for now.

So let me ask: do you have a level of membership that gives you access to sims?

One of the problems with the screener is that, whether you use Python or a spreadsheet, it is a bit of a nightmare to match up the returns, the returns of the universe (or the sim) each week, the excess returns, the rank of your system, and then the rank of each individual node or factor. You can do it with sims, but just barely.

So, if you have access to sims with your membership: “all” is the ranking system for the sim shown above. Factor1 is the rank of a node in the ranking system for the sim. The other factors (Factor2 and Factor3) are just P123 factors (not functions). But you could do this with functions too.

The ranks for the factors are obtained by simply putting 100% weight on a factor (or a node) in the ranking system and 0% for all of the other weights. Repeat until you have done this with all of the factors, nodes or functions.

I think you will need this in a sim and it is not immediately obvious. You need something like this in the buy rules: portfolio(“MLFactor”).

This is the only thing that will keep all of the different sims you run synced up so you can easily concatenate them (whether in Python or a spreadsheet). You run 4 different sims here: one using the optimized ranking system, and three others each with 100% weight on a single factor or node (Factor1, Factor2 and Factor3), all using portfolio(“MLFactor”) to keep them synced up in this example.
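For what it is worth, here is a minimal sketch of that concatenation step in Python. The file names and the column names (Date, Ticker, Rank, Return) are just placeholders for whatever your sim downloads actually contain:

[code]
# A minimal sketch, assuming each sim's transactions were exported to CSV
# with hypothetical columns: Date, Ticker, Rank, Return.
import pandas as pd

files = {
    "all":     "sim_all.csv",      # optimized ranking system
    "factor1": "sim_factor1.csv",  # 100% weight on Factor1
    "factor2": "sim_factor2.csv",  # 100% weight on Factor2
    "factor3": "sim_factor3.csv",  # 100% weight on Factor3
}

merged = None
for name, path in files.items():
    df = pd.read_csv(path, parse_dates=["Date"])
    df = df.rename(columns={"Rank": f"rank_{name}"})
    if merged is None:
        # Keep the returns from the "all" sim; the other sims only contribute ranks.
        merged = df[["Date", "Ticker", f"rank_{name}", "Return"]]
    else:
        # Because portfolio("MLFactor") keeps the holdings identical, an inner
        # join on (Date, Ticker) lines everything up.
        merged = merged.merge(df[["Date", "Ticker", f"rank_{name}"]],
                              on=["Date", "Ticker"], how="inner")

merged.to_csv("training_data.csv", index=False)
[/code]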

And as I mention above, I think you have to use this: “sell rule: 1, Force Positions into Universe: No, and Allow Immediate Buyback: No.” Otherwise, the “buy/sell difference” with each rebalance messes everything up and you have to manually remove each one. And the label can end up covering more than one week, which really screws things up.

Anyway, if your membership allows you to use sims I can get you up and going with this. Please let me know what I can expand upon.

For screens you have to do one week at a time, and P123 will cut you off after 5 weeks even though you are using just ranks. DataMiner might not cut you off at 5 downloads, but you can only download one week at a time and you will have to figure out a way to get excess returns.

This will not work with raw returns in my experience. The data is too noisy—fluctuating with every change in oil price, Fed move or Trump tweet.

Anyway, my advice is to not waste your time if you cannot get excess returns. And I do not think a cap-weighted benchmark will cut it.
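Here is one rough way to build that excess-return label once you have the concatenated table above (again, the column names are placeholders):

[code]
# A rough sketch: each ticker's weekly return minus the equal-weighted return
# of everything the sim held that week, rather than a cap-weighted benchmark.
import pandas as pd

trades = pd.read_csv("training_data.csv", parse_dates=["Date"])

weekly_mean = trades.groupby("Date")["Return"].transform("mean")
trades["ExcessReturn"] = trades["Return"] - weekly_mean
trades.to_csv("training_data.csv", index=False)
[/code]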

Hope this helps some. Sorry for the length of this post. But there are quite a few tricks to getting this (with my method at least). And I probably did not cover them all and probably was not very clear on the ones I did cover as it is.

But once you get the tricks you can do machine learning at P123. You do not have to follow Marc over to Chaikin Analytics to do machine learning;-) Isn’t it ironic?

I am trusting that Marco will not block this method after I responded to his request to learn how to do this himself. I do not think P123 wants to block data. They just do not know their potential yet. That is my hope anyway.

Best,

Jim

Thanks again Jim, I have access to sims and I was aware of the portfolio() function, but I would never have thought of using it that way - brilliant! When I find the time I will try to replicate your analysis with my own ranking system.

Ole Petter,

Thank you!

Please contact me on the forum or by email if I can help at all.

Also, if you want P123 to streamline this in any way you might consider contacting Steve Auger by email.

For whatever reason, Steve has been able to capture Marco’s attention on this. And the combined programming skills of Steve and Marco are out of this world.

I think Steve (or I) can share some code for XGBoost and/or TensorFlow if you are interested.
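To be clear, this is not Steve’s code. But a minimal XGBoost sketch, assuming the training_data.csv and excess-return label built in my earlier post, looks something like this:

[code]
# A minimal XGBoost regression sketch on the tabular factor-rank data.
# Column names (rank_factor1, etc.) and the file are placeholders.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

data = pd.read_csv("training_data.csv", parse_dates=["Date"])
features = ["rank_factor1", "rank_factor2", "rank_factor3"]

# Hold out the last 20% of rows (no shuffling) as a simple test set.
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["ExcessReturn"], test_size=0.2, shuffle=False
)

model = xgb.XGBRegressor(
    n_estimators=500,
    learning_rate=0.01,  # the "shrinkage" setting
    max_depth=3,
)
model.fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))
[/code]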

And Steve is seeking to form a group to avoid bothering people who have no interest in this.

There is a lot more that can be done with this. Stuff that is done everywhere but here: like screening the entire universe for a large number of factors using the Feature Importance mentioned above.

One can rationally argue how useful Feature Importance really is. But de Prado is clear about this in his book:

[i]“Backtesting is not a research tool. Feature importance is.”[/i]

de Prado, Marcos López. Advances in Financial Machine Learning (Kindle Locations 3402-3404). Wiley. Kindle Edition.
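Continuing the XGBoost sketch above, pulling feature importance out of the fitted model is essentially one line:

[code]
# Per-feature importance from the fitted model above, here measured as the
# average gain contributed by splits on each factor rank.
import pandas as pd

importance = pd.Series(
    model.get_booster().get_score(importance_type="gain")
).sort_values(ascending=False)
print(importance)
[/code]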

Actually, P123 now agrees that backtesting has limitations.

It is not clear what they see as the best alternative or how that will evolve. But again, de Prado is clear on this.

In any case, whether it is about feature importance or anything else make sure to contact Steve or me.

Best,

Jim

In a few weeks’ time I will present my method and design for an AI-based indicator for current-quarter surprise for a subset of software stocks. My objective is to generate interest in use of P123 in conjunction with AI. The indicator design will be presented here as a series of posts, unless Marco creates a separate platform/forum specific to ML. Ultimately, you should be able to do everything with a few lines of code plus the s/w library that I am working on polishing up right now. To get maximum benefit from my posts, readers might want to brush up on their Python skills. I found this to be a good site for reference: https://www.w3schools.com/python/default.asp

Python is a pretty easy-to-use programming language. If you are already a programmer it won’t take long to pick it up and use.

I will be using Google Colab as the development environment and Google Drive for file storage and retrieval. The advantage of Colab is that you don’t have to muck up your PC with all sorts of installations that often result in strange effects on the functioning of your PC. Users can use their own development environment but will have to tailor any code I provide to accommodate storing and retrieving data files and importing libraries.
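For reference, the Colab-plus-Drive setup is only a couple of lines; the path below is just a placeholder for wherever you keep your files:

[code]
# Mount Google Drive inside a Colab notebook for file storage and retrieval.
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd
# Hypothetical path -- whatever folder you keep your P123 downloads in.
data = pd.read_csv('/content/drive/MyDrive/p123/training_data.csv')
[/code]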

Also, xgboost will be the primary AI engine: https://xgboost.readthedocs.io/en/latest/python/python_api.html
I will also provide a tensorflow interface, but training is much slower and the results not as good.

Steve

I’m just starting to look at this issue but it seems to me that dumping ranks (top and sub-node(s)) should be easy and relatively inexpensive for p123. For a sim, those values need to be computed anyway, so dumping them along the way is the only additional step. Disk storage (file size) and sim bottlenecks (disk IO) may be issues, though. I would hate to see data collection get overly complicated.

Steve knows what he is talking about here.

The above demonstration with JASP was done in about an hour with the time mostly spent on writing and screen shots. And a little time with JASP.

Normally one would spend some time adjusting (and validating) the hyperparameters in a boosting program.

The only hyperparameter I changed was “Shrinkage,” to 0.01 (based on previous experience with boosting programs). I also changed the Training and Validation Data to K-fold with 5 folds, which is not a hyperparameter. That was all the time that I spent. I did this before I saw how the test sample performed.

I thought my point was already made. And that no one would claim that changing these 2 things from their default settings was just too hard for a serious investor.

Anyone wanting to spend more time with JASP should also change “Min. observations in node” and “Interaction depth.” The defaults that I used here are almost certainly not optimal. And the optimal hyperparameters will be different for different data (including your data).
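For anyone who would rather do this in code than in the JASP menus, the rough scikit-learn equivalents of those settings look something like this. The name mapping is my own rough translation, and the data file is just a placeholder:

[code]
# Rough scikit-learn equivalents of the JASP settings discussed above:
#   Shrinkage                  -> learning_rate
#   Interaction depth          -> max_depth
#   Min. observations in node  -> min_samples_leaf
# 5-fold cross-validation stands in for JASP's K-fold validation option.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

data = pd.read_csv("training_data.csv")
X = data[["rank_factor1", "rank_factor2", "rank_factor3"]]
y = data["ExcessReturn"]

model = GradientBoostingRegressor(
    learning_rate=0.01,    # "Shrinkage"
    max_depth=3,           # "Interaction depth"
    min_samples_leaf=10,   # "Min. observations in node"
    n_estimators=500,
)
scores = cross_val_score(model, X, y, cv=KFold(n_splits=5))
print("Mean 5-fold R^2:", scores.mean())
[/code]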

The real time that I have spent with boosting has been with XGBoost, which is the program professionals use, and it does offer some additional capabilities. But is XGBoost better than a neural net, as Steve says?

Steve’s opinion of neural nets is shared by many. Here is a quote from the web. I do not think it is from a famous person but the same quote can be found everywhere:

“Xgboost is good for tabular data………whereas neural nets based deep learning is good for images….”

“Tabular data” is just what is found in an Excel spreadsheet.

I actually disagree with this blanket statement. TensorFlow can beat XGBoost SOMETIMES, I think.

But XGBoost is the place to start. And Steve is using TensorFlow too.

If you just want to make money you should see if Steve has something you can use.

I have limited experience with one model only. But from what I can see, xgboost is far superior for the type of application that I am developing. Either that or I have been fooling myself into believing that what I am doing is correct. One of the two. Anyways, when I get around to presenting what I have, hopefully it will be peer reviewed by the scientific minds here (I think there are many hiding in the shadows). I don’t mind getting a little egg on my face if there is something I am doing wrong. It will save me some grief down the road.

Interesting twitter thread. [url=https://twitter.com/RobinWigg/status/1331168066177294336]https://twitter.com/RobinWigg/status/1331168066177294336[/url]

1.0039^52 = 1.22… where are you getting a 50% excess return from? (The P123 chart has an annualized return of 49.5%?)

Thanks Philip,

“greater” excess return.

(0.0039 / 0.0025 - 1) * 100 = 56%

Admittedly, a medical way to look at this, as in: “people on statins have a 0.0001% chance of dying while those on placebo had a 0.00015% chance of dying.” A fifty percent increase in deaths for the placebo group.

Despite the obvious problems with this we keep talking that way.

You ask a great question. One I did not really think about until I posted this: is this significant in a “clinical” sense?

But it is a good question and an obvious one. So soon after posting I ran these numbers: (1 + (0.0039 - 0.0025)) = 1.0014

1.0014^52 = 1.0755 or 7.55%
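In code, the two ways of slicing the same weekly numbers:

[code]
# Two ways of annualizing the same weekly excess returns.
weekly_boosted = 0.0039   # weekly excess return with boosting
weekly_base    = 0.0025   # weekly excess return without it

relative_improvement = (weekly_boosted / weekly_base - 1) * 100              # ~56%
annualized_difference = ((1 + (weekly_boosted - weekly_base)) ** 52 - 1) * 100  # ~7.55%

print(relative_improvement, annualized_difference)
[/code]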

Thank you for expanding on this. This is also something that is endlessly discussed in medicine. Should you take a statin? Should I go through the extra work of Boosting?

So I like your way of looking at it. And perhaps 7.5% is the number we want.

Meaningful? I think so. And I think one can do better with just a little more work. Especially with XGBoost.

But I would be interested to see what others find with their ranking systems. And see if they think what they find is meaningful.

This was meant as just a simple, first look at boosting that most people could do on their own (although even this is not exactly easy). For me personally, I have found much more meaningful numbers are possible.

I think with Marco’s API releases and Steve’s sharing of code members can see how much more potential XGBoost might be able to provide for their own systems and not have to trust me on any of this.

Thanks.

Jim

@Jrinne
Did you take into account transaction costs + slippage? Because your portfolio turnover of 5000% is a true killer. I wouldn’t be surprised if your returns were down the gutter after properly taking transaction costs + slippage into account.

You’re missing the point. Your P123 chart says you achieved a 50% annual return. The benchmark did 50% over the entire time. You’re claiming a 0.39% weekly excess return. That doesn’t equate to a 50% annual return. Add a 0.39% weekly return to any benchmark you want (SPX, SP1500 Value, etc.) and you don’t end up with a 50% annual return. Something’s wrong.

Just to clarify what you’ve achieved, you used JASP’s boosting technique to optimize weighting for three factors (which happen to be composites, but JASP only saw the three factors), which increased your weekly excess return from 0.25% to 0.39%?

My apologies if I am misleading.

So the tickers are every single trade in the P123 sim: exactly 25 trades every week.

For my study the excess returns are excess relative to the sim. I cannot stress enough how one has to get the noise of the market out of the data.

So if one week ticker ABC happened to have a 0.39% excess return, as I use it here, that would be in addition to the return of the 25-stock model.

Is that responsive at all?

I have to go for a while. But please ask about anything.

I am pretty sure that boosting did about 7.5% better annualized with this simple model than if you picked stocks based on rank. I think I can explain that (or would appreciate the correction if I missed something).

Jim

Ahhhh ok, so let me go again: you had a model that was decent and already did, say, a 40% annualized return. You then used JASP to tweak the weightings for 3 composites. The new tweaked model had an annualized return of 50%.

Is that right?

This is not a sim for trading.

This is a sim for getting data.

Just as the API Marco will provide gives you factor ranks and returns.

If Marco is smart he will not add any noise to that data with slippage.

You/he will have to work out the slippage later.

This is a method to get data and only to get data. Data used to train boosting (or TensorFlow).

Jim

Quantinomics,

The weekly turnover is intentional for collecting data for boosting.

The target (label) should be over the same time period. One ticker where the target (label) is 1 week’s return and another where the target is 6 weeks’ return does not work at all.
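A minimal sketch of what a consistent one-week label looks like, assuming a simple weekly price table (the file and column names are placeholders):

[code]
# Every label covers exactly one week, so all training rows share the same horizon.
import pandas as pd

px = pd.read_csv("weekly_prices.csv", parse_dates=["Date"]).sort_values(["Ticker", "Date"])
px["label_1wk_return"] = px.groupby("Ticker")["Close"].shift(-1) / px["Close"] - 1
[/code]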

Yes. Exactly.

And what I did with JASP was not a serious attempt. One can do better.

Thank you Philip.

Hi Jim, after my last post on data leakage, you asked me where I come from and when my last ML project was, but please consider reading it again and thinking about the possible consequences for your process. I had a quick look at the JASP online materials (without testing the software). It seems the data-splitting process is random: you just specify a percentage for the test set and a validation method (maybe some parameters can override randomness: possibly the “test set indicator”). As the “test set indicator” is set to none in your screenshot, it is likely that both the test set and the validation subsets used in the k-fold validation are all picked randomly from a single time period and a single ticker universe. Please check this in the full JASP documentation, because if it is really done this way, it means your training set, your k-fold validation subsets and your test set are randomly intertwined in time and in ticker universe, whereas they should be independent in at least one dimension of the double index (date, ticker). It may result in massive data leakage. Results may be very misleading if I am right in interpreting what JASP does. Moreover, your P123 simulation may also include the same time period (?).
You should look into this “test set indicator” parameter to see if it is possible to segregate the training/validation/test sets in different time periods and/or different ticker subsets.
If, after looking into this, it is not possible to split data by time periods and/or ticker subsets, JASP (in its current version) may not be the right tool to deal with financial time series.

I made a similar mistake at the beginning of my “last ML project”: I used the default “random splitting” parameter of a sklearn function and figured it out after a few days. After correcting that, my super model with a flawed 80% predictive power appeared to be much less attractive.
These are just random thoughts from a guy living in the third world with no professional experience in ML but who wants to help.
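Just to illustrate the kind of time-segregated split I mean, in pandas terms (the file and column names are placeholders, and the gap between the two periods is optional):

[code]
# Split by date instead of randomly, so the test period never overlaps training.
import pandas as pd

data = pd.read_csv("training_data.csv", parse_dates=["Date"]).sort_values("Date")

cutoff = pd.Timestamp("2015-01-01")   # everything on or after this date is held out
gap = pd.Timedelta(weeks=4)           # optional buffer between train and test

train = data[data["Date"] < cutoff - gap]
test  = data[data["Date"] >= cutoff]
[/code]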

Frederic,

You certainly make some points worth considering.

So first, if one wants to do it your way: JASP gives you the ability to pick and mark the data that you want to be in the test set. And this is how I usually do it with my studies.

I chose the simpler approach for illustration. While simpler, I believe it is valid and is the preferred method for some situations.

So one can, if they want, train and validate on earlier data (say 2005 to 2015) and test on data from 2015 to 2020. But I disagree that one is always required to do it this way.

I do agree that one should do this with time-series data. And the “embargo period” that you mention is also a good idea. I assume you know what I mean by “embargo” but perhaps you call it something different.

Mine was NOT time-series data.

I completely disagree with what you are saying for a cross-sectional study on data that is i.i.d. or ergodic, and cannot imagine where you would get that.

For example, if a company wants to test whether a web-page that is green gets more clicks than a web page that is orange do they have to separate-out the test data to be after the training and validation data in time?

Simple answer: no. And again I do not understand where you would get that.

The company could test their web-pages with exactly the same method I used—using JASP if they want.

I think a full discussion of this is beyond the scope of this thread. [b]But I think the people who designed JASP kind of knew what they were doing when they made this method an option for cross-sectional studies.[/b]

But if you prove me wrong, maybe we can call JASP (as well as Amazon and a few other operators of web pages in Silicon Valley) and tell them how they should be doing this.

P123 can hire a professional to supervise this if they want but I do not think some committee from the present forum has the understanding to supervise how others use this in any intelligent manner.

That was my only point in my previous post and it seems even more true now.

Jim

Delete or edit I guess. I had a separate point. But I think I will stick with the idea that the present forum is not capable of deciding this.

For sure, I am not applying for the job of moderator of machine learning in the forum. I was glad to try to help out (when asked), as I did when I shared my knowledge of Colab and XGBoost and TensorFlow (and Anaconda, which Steve did not appreciate much compared to Colab) with Steve, and I guess Marco too, as he said: “I’d also want to try it myself.”

I think Marco and others have an easy and valid first-way to do this for cross-sectional data. I am not sure whether Marco has moved on to XGBoost yet or not. If so, it is probably with Steve’s help.

People can modify what I have done above to their heart’s content. Expand upon it. Decide for themselves.

And Marco, if you are considering putting some committee from the present forum (as it is now) in charge of this you should just shelve this project. It will not end up being worth anything and it will be a waste of everyone’s time.

And Marco, thank you. We do not have anyone c*ssing and calling everyone a quant yet. Or lecturing us about how the Theil-Sen estimator is the only acceptable quantitative method. BTW, I do not see you stopping anyone from using the Theil-Sen estimator if they want. I know you had a lot to do with that. I think you will find that to be a wise business decision (or not). You can still shelve the project if you think that is best—with no complaints from me.