Monotonic Constraints

@pitmaster,

Thank you. I cannot make boosting work without monotonic constraints. Not well, anyway:

These are the results of a k-fold cross-validation with an embargo for various portfolio sizes: first with, then without, monotonic constraints (all other parameters the same). I used scikit-learn's HistGradientBoostingRegressor here, with rank normalization.
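For anyone who wants to try the general idea, here is a minimal sketch. The data is a random placeholder standing in for rank-normalized factors, and the embargo step is omitted; scikit-learn's monotonic_cst takes one value per feature (+1 increasing, -1 decreasing, 0 unconstrained):

```python
# Minimal sketch only: random placeholder data, no embargo logic.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 3))  # stand-in for rank-normalized factors
y = X @ np.array([0.5, -0.3, 0.0]) + rng.normal(0.0, 0.1, 1000)

# One constraint per feature: +1 increasing, -1 decreasing, 0 unconstrained
model = HistGradientBoostingRegressor(monotonic_cst=[1, -1, 0])
print(cross_val_score(model, X, y, cv=5).mean())
```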

@marco and @Riki37, it would be good if all of the features of XGBoost were available when the AI/ML is released. The use of monotonic constraints may be necessary to make XGBoost functional for P123 members.

Much appreciated.

Jim

So each feature can be either -1, 0, or +1? So if I have 90 features, the setting for monotone_constraints is a string of 90 values? Basically what we have now as "Higher/Lower is better."

Obviously this starts getting crazy. Surely we'd only want to apply monotonicity to some of the features, not all. Can you give a concrete example of a use case with 90+ features made up of all sorts of factors?

Thanks

Hi Marco,

Please understand that I actually know very little about what you are doing with the AI/ML and I will very quickly make a mistake if I assume too much.

So are you going to provide access to all of the features of XGBoost? I don't actually know. If you are, then this is already a feature, and you would not even have to provide any written documentation, as XGBoost is already well documented: Monotonic Constraints.

If not, then I am happy to give my take on a use-case as you have asked.

Use case:

  1. I cannot get boosting to work without it. See my results above. Maybe you have had a different experience, or others may want to share their positive results using boosting without monotonic constraints. But my results are the only results I have seen posted in the forum so far.

So, on the topic of ease of use:

Many already use the factors in ranking system. Maybe they are even funding those ranking systems. They want to see if boosting will improve things.

So here is my question: would it really be hard for them to look at their ranking system, see whether they have checked "Higher" or "Lower" is better, and make a list (tuple or dictionary) in Python that corresponds to what they have done in their ranking system? A single line of code for the list? I don't know. Everyone will have to answer that for themselves, I guess, but a sketch of what that line might look like is below.
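Something like this, perhaps. The factor names are invented for illustration, and the order of the entries would have to match the column order of the training data:

```python
# Hypothetical mapping from a ranking system's "Higher/Lower is better"
# checkboxes to XGBoost's monotone_constraints. Factor names are made up,
# and their order must match the column order of the training data.
import xgboost as xgb

higher_is_better = {
    "EarningsYield": True,   # higher is better -> +1
    "DebtToEquity": False,   # lower is better  -> -1
    "Momentum12M": True,
}

# XGBoost accepts the constraints as a string such as "(1,-1,1)"
constraints = "(" + ",".join("1" if hib else "-1" for hib in higher_is_better.values()) + ")"
model = xgb.XGBRegressor(monotone_constraints=constraints)
```

The same +1/-1 values (as a plain list of ints) also work for scikit-learn's monotonic_cst.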

Anyway, for me, a Python user using a DataMiner download, Pitmaster's advice actually made boosting work where it did not before. Without it I would be forced to use something else: a random forest (where monotonic constraints do not seem to make much difference for my factors), a linear method, or I could keep doing what I am doing now (a machine learning method other than what you will be providing in your AI/ML).

Thank you for the question.

Addendum: Once I use monotonic constraints, my results with boosting are EXTREMELY ROBUST. The hyperparameters have a wide range with similar results; i.e., my results are basically the same over a wide range of min_samples_leaf, max_features, etc. Night and day. Like flipping a switch.
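To show the shape of the check I mean (placeholder data again, not my actual factors): hold the constraint vector fixed, sweep one hyperparameter, and compare the cross-validated scores.

```python
# Illustrative robustness check: constraints fixed, one hyperparameter swept.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 3))  # placeholder features
y = X @ np.array([0.5, -0.3, 0.0]) + rng.normal(0.0, 0.1, 1000)

for leaf in (10, 20, 50, 100):
    model = HistGradientBoostingRegressor(
        min_samples_leaf=leaf,
        monotonic_cst=[1, -1, 0],  # same style of constraint vector as above
    )
    print(leaf, round(cross_val_score(model, X, y, cv=5).mean(), 4))
```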

Jim

We are planning the first release with predefined models (algorithm + params): for example, XGBoost Fast, Medium, and Slow, with hyperparameters decided based on tests that we did.

We are restricting to predefined models because a badly configured param can cause the training to run for weeks and keep resources busy. This is a problem since we are using our own server farm (which is small right now). Leveraging the cloud for high workloads is planned, but it would have delayed everything. We need to know the appetite for this first.

Circling back… With predefined models we therefore cannot offer ‘monotone_constraints’, since it’s tied directly to a feature. So that would have to come later, when we open up the parameters. We’d also need to figure out a good interface. Picture trying to set the constraints for 90 features listed vertically with horizontal, comma-delimited values. Ugh!

FYI, AI Factor features are added using the same interface as the Factor Download tool, so setting a parameter like ‘monotone_constraints’ will have to be done differently.

Thanks

Marco,

Thank you for the insight into the upcoming AI/ML release. Nice!

Thank you for your interest in monotonic constraints. My post was meant more as feedback to Pitmaster, acknowledging the value of his contribution in the forum, and as an idea for other people using Python now. It is nice to discuss the pros and cons for consideration in future AI/ML releases with you.

As always, I am happy that if I decide monotonic constraints have a high priority for me personally, I can implement them Monday using the Factor List Download Tool, and Tuesday if I figure out the API (which I am probably capable of).

The downloads and the upcoming AI/ML are a powerful combination for pretty much anyone interested in machine learning. Thank you for doing that.

Jim