Boosting your returns

I’m just starting to look at this issue, but it seems to me that dumping ranks (top and sub-node(s)) should be easy and relatively inexpensive for P123. For a sim, those values need to be computed anyway, so dumping them along the way is the only additional step. Disk storage (file size) and sim bottlenecks (disk IO) may be issues, though. I would hate to see data collection get overly complicated.

Steve knows what he is talking about here.

The above demonstration with JASP was done in about an hour, with the time mostly spent on writing and screenshots; only a little of it was spent in JASP itself.

Normally one would spend some time adjusting (and validating) the hyperparameters in a Boosting program.

The only hyperparameter I changed was “Shrinkage” to 0.01 (based on previous experience with boosting programs). I also changed the Training and Validation Data to K-fold with 5 folds, which is not a hyperparameter. That was all the time that I spent. I did this before I saw how the test sample performed.

I thought my point was already made. And that no one would claim that changing these 2 things from their default settings was just too hard for a serious investor.

Anyone wanting to spend more time with JASP should also change “Min. observations in node” and “Interaction depth.” The defaults that I used here are almost certainly not optimal. And the optimal hyperparameters will be different for different data (including your data).
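For anyone who prefers a scriptable setting, the rough scikit-learn counterparts of those JASP options look something like this (a sketch only; the parameter-name mapping is my approximation, and the feature/target arrays are placeholders):

[code]
# Rough scikit-learn counterparts of the JASP settings discussed above (a sketch).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

X = np.random.rand(1000, 3)       # placeholder: 3 composite ranks
y = np.random.rand(1000) * 0.01   # placeholder: weekly excess returns

model = GradientBoostingRegressor(
    learning_rate=0.01,    # "Shrinkage" in JASP
    n_estimators=1000,     # low shrinkage usually needs more trees
    max_depth=3,           # roughly "Interaction depth"
    min_samples_leaf=10,   # roughly "Min. observations in node"
)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # 5-fold validation
scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
print(scores.mean())
[/code]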

The real time that I have spent with boosting has been with XGBoost, which is the program professionals use, and it does offer some additional capabilities. But is XGBoost better than a neural net, as Steve says?

Steve’s opinion of neural nets is shared by many. Here is a quote from the web. I do not think it is from a famous person but the same quote can be found everywhere:

“Xgboost is good for tabular data………whereas neural nets based deep learning is good for images….”

“Tabular data” is just what is found in an Excel spreadsheet.

I actually disagree with this blanket statement. TensorFlow can beat XGBoost SOMETIMES, I think.

But XGBoost is the place to start. And Steve is using TensorFlow too.

If you just want to make money you should see if Steve has something you can use.

I have limited experience with one model only. But from what I can see, xgboost is far superior for the type of application that I am developing. Either that or I have been fooling myself into believing that what I am doing is correct. One of the two. Anyways, when I get around to presenting what I have, hopefully it will be peer reviewed by the scientific minds here (I think there are many hiding in the shadows). I don’t mind getting a little egg on my face if there is something I am doing wrong. It will save me some grief down the road.

Interesting twitter thread. [url=https://twitter.com/RobinWigg/status/1331168066177294336]https://twitter.com/RobinWigg/status/1331168066177294336[/url]

1.0039^52 = 1.22… Where are you getting a 50% excess return from? (The P123 chart shows an annualized return of 49.5%?)

Thanks Philip,

I meant a “greater” excess return:

(0.0039/0.0025 - 1) * 100 = 56%

Admittedly, this is a medical way to look at it, as in: “people on statins have a 0.0001% chance of dying while those on placebo have a 0.00015% chance of dying.” A fifty percent increase in deaths for the placebo group.

Despite the obvious problems with this we keep talking that way.

You ask a great question, one I did not really think about until I posted this: is this significant in a “clinical” sense?

But it is a good question and an obvious one. So soon after posting I ran these numbers: (1 + (0.0039 - 0.0025)) = 1.0014

1.0014^52 = 1.0755 or 7.55%
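To put the arithmetic in one place (a quick sketch using the 0.39% and 0.25% weekly figures from above):

[code]
# The two ways of stating the same improvement, from the figures above
boosted_weekly = 0.0039    # 0.39% weekly excess return with the boosted weights
baseline_weekly = 0.0025   # 0.25% weekly excess return with the original ranks

relative = (boosted_weekly / baseline_weekly - 1) * 100           # 56% "greater"
annualized = (1 + (boosted_weekly - baseline_weekly)) ** 52 - 1   # ~7.55% per year

print(relative, annualized)
[/code]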

Thank you for expanding on this. This is also something that is endlessly discussed in medicine. Should you take a statin? Should I go through the extra work of Boosting?

So I like your way of looking at it. And perhaps 7.5% is the number we want.

Meaningful? I think so. And I think one can do better with just a little more work. Especially with XGBoost.

But I would be interested to see what others find with their ranking systems. And see if they think what they find is meaningful.

This was meant as just a simple first look at Boosting that most people could do on their own (although even this is not exactly easy). For me personally, I have found that much more meaningful numbers are possible.

I think that with Marco’s API releases and Steve’s sharing of code, members can see for themselves how much more potential XGBoost might provide for their own systems and not have to trust me on any of this.

Thanks.

Jim

@Jrinne
Did you take into account transaction costs + slippage? Because your portfolio turnover of 5000% is a true killer. I wouldn’t be surprised if your returns were poor after properly taking transaction costs and slippage into account.

You’re missing the point. Your P123 chart says you achieved a 50% annual return. The benchmark did 50% over the entire period. You’re claiming a 0.39% weekly excess return. That doesn’t equate to a 50% annual return. Add a 0.39% weekly return to any benchmark you want (SPX, SP1500 Value, etc.) and you don’t end up with a 50% annual return. Something’s wrong.

Just to clarify what you’ve achieved, you used JASP’s boosting technique to optimize weighting for three factors (which happen to be composites, but JASP only saw the three factors), which increased your weekly excess return from 0.25% to 0.39%?

My apologies if I am misleading.

So the tickers are every single trade in the P123 sim—exactly 25 trades every week.

For my study the excess returns are excess relative to the sim. I cannot stress enough how one has to get the noise of the market out of the data.

So if one week ticker ABC happened to have a 0.39% excess return, as I use it here that would be in addition to the return of the 25-stock model.
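Concretely, the label I feed to boosting is each ticker’s weekly return minus the average return of the 25 stocks held that week. A sketch (with hypothetical file and column names):

[code]
# Sketch: excess return measured against the sim itself (hypothetical file/columns).
# Each week, subtract the equal-weight return of the 25 stocks held that week,
# so the label measures a ticker only against the rest of the model.
import pandas as pd

trades = pd.read_csv("sim_trades.csv")   # columns: date, ticker, weekly_return
trades["excess_vs_sim"] = (
    trades["weekly_return"]
    - trades.groupby("date")["weekly_return"].transform("mean")
)
[/code]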

Is that responsive at all?

I have to go for a while. But please ask about anything.

I am pretty sure that Boosting did about 7.5% better annualized with this simple model than if you picked stocks based on rank. I think I can explain that (or would appreciate the correction if I missed something).

Jim

Ahhhh ok so let me go again: You had a model that was decent and already did say a 40% annualized return. You then used JASP to tweak the weightings for 3 composites. The new tweaked model had an annualized return of 50%.

Is that right?

This is not a sim for trading.

This is a sim for getting data.

Just as the API Marco will provide gives you factor ranks and returns.

If Marco is smart he will not add any noise to that data with slippage.

You/he will have to work out the slippage later.

This is a method to get data and only to get data. Data used to train boosting (or TensorFlow).

Jim

Quantinomics,

The weekly turnover is intentional for collecting data for boosting.

The target (label) should cover the same time period for every row. One ticker where the target (label) is 1 week’s return and another where the target is 6 weeks’ return does not work at all.
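A sketch of what that alignment might look like (the file and column names here are hypothetical):

[code]
# Sketch: build a 1-week forward return so every row's target covers the
# same horizon (file and column names are hypothetical).
import pandas as pd

df = pd.read_csv("weekly_ranks.csv")          # columns: date, ticker, node ranks..., close
df = df.sort_values(["ticker", "date"])

# 1-week forward return per ticker -- the target (label) for boosting
df["fwd_return_1w"] = df.groupby("ticker")["close"].shift(-1) / df["close"] - 1

# The last week for each ticker has no forward return yet, so drop it
df = df.dropna(subset=["fwd_return_1w"])
[/code]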

Yes. Exactly.

And what I did with JASP was not a serious attempt. One can do better.

Thank you Philip.

Hi Jim, after my last post on data leakage, you asked me where I come from and when my last ML project was, but please consider reading it again and thinking about the possible consequences for your process. I had a quick look at the JASP online materials (without testing the software). It seems the data-splitting process is random: you just specify a percentage for the test set and a validation method (maybe some parameters can override randomness: possibly the “test set indicator”). As the “test set indicator” is set to none in your screenshot, it is likely that both the test set and the validation subsets used in the k-fold validation are all picked randomly from a single time period and a single ticker universe. Please check this in the full JASP documentation, because if it is really done this way, it means your training set, your k-fold validation subsets, and your test set are randomly intertwined in time and in ticker universe, whereas they should be independent in at least one dimension of the double index (date, ticker). It may result in massive data leakage. Results may be very misleading if I am right in interpreting what JASP does. Moreover, your P123 simulation may also include the same time period (?).
You should look into this “test set indicator” parameter to see if it is possible to segregate the training/validation/test sets in different time periods and/or different ticker subsets.
If, after looking into this, it is not possible to split the data by time period and/or ticker subset, JASP (in its current version) may not be the right tool for dealing with financial time series.

I made a similar mistake at the beginning of my “last ML project”: I used the default “random splitting” parameter of an sklearn function and only figured it out after a few days. After correcting that, my super model with a flawed 80% predictive power turned out to be much less attractive.
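For anyone using scikit-learn, the fix is a single argument (a sketch with placeholder arrays; the point is only shuffle=False on date-sorted data):

[code]
# Sketch: scikit-learn's default shuffled split mixes dates between train and
# test; with data sorted by date, shuffle=False keeps the test period out of
# the training period. X and y are placeholders here.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)   # placeholder features, already sorted by date
y = np.arange(100, dtype=float)     # placeholder targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False   # the default, shuffle=True, can leak
)
[/code]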
These were just random thoughts from a guy living in the third world with no professional experience in ML, but who wants to help.

Frederic,

You certainly make some points worth considering.

So first, if one wants to do it your way: JASP gives you the ability to pick and mark the data that you want in the test set. And this is how I usually do it with my studies.

I chose the simpler approach for illustration. While simpler, I believe it is valid and is the preferred method for some situations.

So one can, if they want, train and validate on earlier data (say 2005 to 2015) and test on data from 2015 to 2020. But I disagree that one is always required to do it this way.

I do agree that one should do this with time-series data. And the “embargo period” that you mention is also a good idea. I assume you know what I mean by “embargo” but perhaps you call it something different.
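For time-series data that would look something like the sketch below (the split date and embargo length are arbitrary placeholders):

[code]
# Sketch: time-based split with an embargo gap, so returns that straddle the
# boundary cannot leak from training into testing. Dates are placeholders.
import pandas as pd

df = pd.read_csv("weekly_data.csv", parse_dates=["date"])   # hypothetical file

train_end = pd.Timestamp("2015-01-01")
embargo = pd.Timedelta(weeks=4)      # arbitrary embargo length

train = df[df["date"] < train_end]
test = df[df["date"] >= train_end + embargo]
[/code]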

Mine was NOT time-series data.

I completely disagree with what you are saying for a cross-sectional study on data that is i.i.d. or ergodic, and I cannot imagine where you would get that.

For example, if a company wants to test whether a web page that is green gets more clicks than a web page that is orange, do they have to separate out the test data so that it comes after the training and validation data in time?

Simple answer: no. And again I do not understand where you would get that.

The company could test their web-pages with exactly the same method I used—using JASP if they want.

I think a full discussion of this is beyond the scope of this thread. [b]But I think the people who designed JASP kind of knew what they were doing when they made this method an option for cross-sectional studies.[/b]

But if you prove me wrong, maybe we can call JASP (as well as Amazon and a few other operators of web pages in Silicon Valley) and tell them how they should be doing this.

P123 can hire a professional to supervise this if they want but I do not think some committee from the present forum has the understanding to supervise how others use this in any intelligent manner.

That was my only point in my previous post and it seems even more true now.

Jim

Delete or edit I guess. I had a separate point. But I think I will stick with the idea that the present forum is not capable of deciding this.

For sure, I am not applying for the job of moderator of machine learning in the forum. I was glad to help out (when asked), as I did when I shared my knowledge of Colab, XGBoost, and TensorFlow (and Anaconda, which Steve did not appreciate much compared to Colab) with Steve—and I guess Marco too, as he said: “I’d also want to try it myself.”

I think Marco and others have an easy and valid first-way to do this for cross-sectional data. I am not sure whether Marco has moved on to XGBoost yet or not. If so, it is probably with Steve’s help.

People can modify what I have done above to their heart’s content. Expand upon it. Decide for themselves.

And Marco, if you are considering putting some committee from the present forum (as it is now) in charge of this you should just shelve this project. It will not end up being worth anything and it will be a waste of everyone’s time.

And Marco, thank you. We do not have anyone c*ssing and calling everyone a quant yet. Or lecturing us about how the Theil-Sen estimator is the only acceptable quantitative method. BTW, I do not see you stopping anyone from using the Theil-Sen estimator if they want. I know you had a lot to do with that. I think you will find that to be a wise business decision (or not). You can still shelve the project if you think that is best—with no complaints from me.

Jim, your ideas are valuable and certainly not a waste of time.
It’s also the purpose of a forum to confront different points of view in a civil way. We don’t need a police force or a committee.
You want to promote ML in P123, and I am absolutely on your side. But we have to be a bit careful when we write here because forum threads may be taken as a reference. In this particular case, your objective is not very explicit. You are showing parameters of JASP that may result in using ML algos for curve-fitting rather than generating a predictive model. Then you show the feedback of the curve-fitting process in a simulation. I think I understand why you do that, but it may be misleading for readers coming here without understanding everything that is involved.

Frederic,

If you look back at my threads you will find that I have been a strong advocate of certain people picking up a book. I kind of get that. Or, even better, P123 could hire a consultant again if they want to actually be involved in how any AI/machine learning models are being used. They have done it before.

Maybe P123 could get professional confirmation as to whether a rank (as long as this “transformation” preserves the order of the data) is as good as raw data for a boosting model, for example.
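For what it’s worth, a quick sanity check along those lines is easy to run (a sketch, not a proof; the data here is simulated): tree-based boosting only compares feature values to thresholds, so an order-preserving transformation such as ranking should leave the fitted model essentially unchanged.

[code]
# Sketch: tree ensembles split on thresholds, so rank-transforming a feature
# (which preserves order) should leave the fitted model essentially unchanged.
import numpy as np
from scipy.stats import rankdata
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(500, 3))                     # placeholder raw factor values
y = X_raw[:, 0] * 0.5 + rng.normal(size=500) * 0.1    # placeholder target

X_rank = np.column_stack([rankdata(col) for col in X_raw.T])   # rank transform

m_raw = GradientBoostingRegressor(random_state=0).fit(X_raw, y)
m_rank = GradientBoostingRegressor(random_state=0).fit(X_rank, y)

# In-sample predictions should match (up to ties/rounding)
print(np.allclose(m_raw.predict(X_raw), m_rank.predict(X_rank)))
[/code]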

I am sorry you do not get it. I do think Steve has gotten it. He has been willing to put some time into it on his own.

I am happy to help Philip, Steve and others who may want to take what I have done and expand upon it and improve upon it to see for themselves. To help them get started if they are interested. I do not plan to fully expand upon all of the possible improvements in this thread. Or go into all of the theoretical justifications for my methods here. Members can find their own (probably better) methods if they do not like mine.

It isn’t practical or even possible in this format anyway. My posts are long enough as it is. And I think Marco sees that.

This discussion is already boring me and I like machine learning!!!

I await your more complete, peer-reviewed study when you have your own findings to contribute. And in the meantime, please, take over with your own methods and findings on this forum.

Until then, I think Marco is smart to allow members the opportunity to find their own use for the API even allowing them to make some mistakes along the way.

Unless Marco wants to hire a professional consultant, it is my opinion that the “peers” (as in peer-review) on this forum are not up to it.

This could change. I think P123 can attract some people well-versed in machine learning. Some are already here (but not posting in the forum). They have contacted me by email. I am not holding my breath waiting for them to share their ideas with us on the forum, however.

Marco can continue to help them with data as long as it fits his business model, I think. Or not. I leave it to Marco.

Best,

Jim

I just need to be able to make the target column (i.e., forward one-month returns) from the API without running out of room. I want data for a 3000-stock universe with, say, 20 nodes as rankings, plus the one-month forward returns for each stock, over a period of 10 years. If this can’t be automated, I refuse to pull it one week at a time, i.e., pull one week, reset my API key, and then repeat for 10 years.

If someone can show me that (Steve’s Python code was close, but my API key keeps running out), I will happily expand on what Jim has started and report back.

P123 would be wise to waive any fees and facilitate this. For Philip or anyone willing to share at this point.

Whatever Philip finds would be well worth any network or data costs incurred. P123 would either know to put this project on the back-burner or possibly get an early advertisement for their AI/machine learning project should they decide to continue it. Either way, well worth the investment.
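In the meantime, here is the pattern I would try for the quota problem (a sketch only; fetch_week below is a hypothetical stand-in for whatever call Steve’s script makes, not the actual P123 endpoint): cache each weekly pull to disk so an interrupted run can resume instead of starting over.

[code]
# Sketch: pull one week at a time and cache each pull to disk, so a run that
# dies on quota can resume where it left off instead of starting over.
# fetch_week() is a hypothetical placeholder, not the actual P123 API call.
import os
import pandas as pd

def fetch_week(d):
    # Placeholder: swap in the actual API call from Steve's script here.
    raise NotImplementedError

os.makedirs("cache", exist_ok=True)
dates = pd.date_range("2010-01-01", "2020-01-01", freq="W-MON")

for d in dates:
    path = f"cache/ranks_{d.date()}.csv"
    if os.path.exists(path):          # already downloaded -- skip on resume
        continue
    fetch_week(d).to_csv(path, index=False)

full = pd.concat(
    pd.read_csv(os.path.join("cache", f)) for f in sorted(os.listdir("cache"))
)
[/code]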