The P123 optimizer is an awesome tool that could be improved and made highly attractive to new and old members alike

All,

TL;DR: The optimizer, especially with some additional features, might be better than support vector machines and might gain wider support among present members. It could also ease any transition to other methods members may want to try later (e.g., XGBoost) by introducing similar optional methods (making XGBoost’s early stopping more understandable, for example).

AND it lets one use the result in a standard ranking system in a live port, like we have all been doing for years!

So the optimizer is an awesome tool, and many have used it to find good models. There is a method out there that uses a spreadsheet to randomize the factor weights and load them back into the optimizer. I will not go into the full algorithm, for brevity and because I am not sure everyone is using it the same way.

Suffice it to say this is excellent, has served many people well, and is actually a well-accepted method in the machine learning literature. I believe it is close to (or the same as) gradient descent.
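
For anyone curious, here is a minimal sketch of what I believe the spreadsheet workflow amounts to: random search over factor weights. The `backtest_score` function is a toy stand-in of my own; in practice each call would be a P123 optimizer/simulation run.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one optimizer/simulation run. In practice this would be
# a P123 backtest of a ranking system using the given factor weights.
TRUE_W = np.array([0.4, 0.3, 0.2, 0.1])  # made-up "ideal" weights

def backtest_score(w):
    # Higher is better; peaks when w matches TRUE_W (purely illustrative).
    return -np.sum((w - TRUE_W) ** 2) + rng.normal(scale=0.001)

best_w, best_score = None, -np.inf
for trial in range(500):
    w = rng.random(4)
    w /= w.sum()  # normalize so the factor weights sum to 1
    s = backtest_score(w)
    if s > best_score:  # keep the best weights seen so far
        best_w, best_score = w, s

print(best_w.round(3), round(best_score, 5))
```

Each trial is independent of all the others, which is what distinguishes this from gradient descent, where each step uses information from the previous one (more on that below).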

I would suggest that P123 discuss this with their machine learning expert. Ask them whether this jrinne (who even picks restaurants with explore/exploit algorithms and is probably autistic or something) is being reasonable in calling this gradient descent.

If your expert says, “Yeah, I kind of get it,” then you should ask them: “Are there some basic ML tools that could supplement this gradient descent thing and make it a full, advanced machine learning tool?”

Also ask: "In that regard, how hard would it be to make these standard ML methods optional for the optimizer?

  1. Early stopping

  2. K-fold cross-validation

  3. Recursive feature elimination

  4. Bootstrapping or subsampling. Subsampling would be less resource-intensive, and it mimics what is being done with Mod() now, so it is already widely accepted." (See the sketch after this list for how Mod()-style splits map onto k-fold cross-validation.)
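
To make item 4 concrete, here is a hedged sketch of how I understand the Mod() trick: hold out every k-th stock by ID, which is essentially a deterministic version of k-fold cross-validation. The stock-ID-based split is my assumption about how people use Mod().

```python
import numpy as np
from sklearn.model_selection import KFold

ids = np.arange(1000)  # stand-in for stock IDs in a P123 universe

# Mod()-style subsampling: hold out every 5th stock as the validation fold.
k = 5
for fold in range(k):
    train_ids = ids[ids % k != fold]
    test_ids = ids[ids % k == fold]
    # ... optimize factor weights on train_ids, score on test_ids ...

# The scikit-learn equivalent: shuffled k-fold over the same IDs.
kf = KFold(n_splits=k, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(ids):
    train_ids, test_ids = ids[train_idx], ids[test_idx]
    # ... same optimize/score loop on each fold ...
```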

BTW, I prefer the more current term: “Spectrum Disorder.” Joking, but what is with me seeing math in everything? Not that it isn’t a useful tool for annoying everyone around me. :thinking:

I think it would work and could be marketed to new machine learners. I think it would help retain some existing members who like this general method, if it improved their out-of-sample results. I doubt it would be more resource-intensive than the other ML methods discussed, but I am least certain about this.

Jim


This would be very useful, I think. The rank optimizer is pretty much non-functional now, and augmentation with external scripts just makes the workflow awkward and tedious.

Probably less resource-intensive overall, considering the current workflow.

I’m curious to see how the ML effort affects the current workflow. If components like the rank optimizer get improved, then all tiers of P123 subscribers would benefit, and that would be great.


Just to add, P123 could charge extra for “early stopping” protocols or cross-validation, according to how much additional usage they create.

I would pay for a fully automated optimizer (with some of these features) in a second, without thinking about it twice. And early stopping may reduce resource usage, as Walter suggests: it does stop the training earlier, and I assume no further proof of that assertion is necessary.
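
As an illustration of the resource point, here is a minimal early-stopping sketch with XGBoost on toy data (this assumes a recent xgboost version, where `early_stopping_rounds` is a constructor argument; older versions pass it to `fit`). Training is capped at 5,000 boosting rounds but stops once the validation error has not improved for 50 rounds, which is exactly where the compute saving comes from.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Toy data standing in for factor exposures (X) and forward returns (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = X[:, 0] * 0.5 + rng.normal(scale=0.1, size=2000)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = xgb.XGBRegressor(
    n_estimators=5000,         # generous cap; early stopping decides the rest
    learning_rate=0.05,
    early_stopping_rounds=50,  # stop when validation error stops improving
)
model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
print("boosting rounds actually used:", model.best_iteration + 1)
```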

K-fold cross-validation would probably increase resource usage, and you might want to charge me for that feature. I do not claim all of this would reduce resource usage. But then again, if I had a model that was working out-of-sample, I might not do as much training long-term.

Recursive feature elimination would reduce resource usage (fewer features used in models equals less computer time, I think).

Anyway, maybe less usage overall, and/or we could pay for what we use, which is only fair (and possibly profitable for P123).

I completely agree with you, Jim. Given the rapid advancement and adoption of machine learning in today’s world, it seems counterintuitive to manually perform rank optimizations using cumbersome and static methods. We should be capitalizing on the capabilities of ML algorithms directly on P123, especially when computing resources are so affordable. P123 could introduce an AI-driven credit system, akin to the API or Data credits, ensuring a positive return on investment for each product feature. I’d be happy to pay for this to save time on coding and focus on feature engineering, modeling, analysis, etc.


TL;DR: P123 is already working on an AI/ML release, and I should not have focused too far ahead of that, I think. My purpose was not to exclude from the discussion what others are already doing. I think people are doing some good things, but largely on their own, including finding tricks with spreadsheets and Mod().

Original post: So, if P123 did this we would need a metric………

Edit: what we do now with the optimizer and spreadsheets is more appropriately called stochastic search or random search, I think. Not gradient descent, as I mistakenly called it earlier. Not really pure financial analysis either, however.

I’m not really for this anymore unless it is handled by whoever is in charge of AI/ML development, and they do something with cross-validation and, at least, improve on the present use of Mod(). I think early stopping is a good idea, but the AI/ML expert might have some better ideas. Maybe they would find a way to do true gradient descent, and if so, maybe that would be preferable for some reason.
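
For contrast with the random-search sketch above, here is what “true” gradient descent (ascent, really, since we maximize a score) might look like on the same toy weight problem, using finite differences because a backtest would not give us analytic gradients. This is my own illustration, not anything P123 has proposed.

```python
import numpy as np

TRUE_W = np.array([0.4, 0.3, 0.2, 0.1])  # same toy objective as before

def score(w):
    return -np.sum((w - TRUE_W) ** 2)

def num_grad(f, w, eps=1e-5):
    # Central-difference gradient: perturb each weight in turn.
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

w = np.full(4, 0.25)  # start from equal weights
for step in range(200):
    w = w + 0.1 * num_grad(score, w)  # step uphill on the score surface
    w = np.clip(w, 0.0, None)
    w /= w.sum()  # keep weights non-negative and summing to 1
print(w.round(3))
```

Unlike random search, each step here depends on the last one. The catch is that real backtest scores are noisy, which makes finite-difference gradients unreliable, and that is one reason random search is a sensible default.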

I don’t think the forum can develop a consensus on this. I will be most interested in what P123 has to offer with regard to its upcoming AI/ML release.

And in what I can do with DataMiner downloads over at Sklearn, which is already well developed and does not require us to reinvent k-fold cross-validation, bootstrapping, regularization, or recursive feature elimination.
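
For example, everything I just listed is already off the shelf in Sklearn. A sketch with made-up toy data standing in for a DataMiner factor download (the feature counts and `alpha` value are arbitrary assumptions):

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))  # toy stand-in for 20 downloaded factors
y = X[:, :3] @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=500)

# Regularization (Ridge) plus recursive feature elimination, nested inside
# 5-fold cross-validation via a pipeline so the selection does not leak:
pipe = make_pipeline(
    RFE(Ridge(alpha=1.0), n_features_to_select=5),
    Ridge(alpha=1.0),
)
scores = cross_val_score(pipe, X, y, cv=5)
print("CV R^2 per fold:", scores.round(3))

# Bootstrapping a training set is a single call:
X_boot, y_boot = resample(X, y, random_state=0)
```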

I think we would get bogged down in the question of whether this is machine learning, reinforcement learning, or financial analysis before we even got started. The forum can’t do it.

Even if people were focused on making it work, I don’t think we could agree on a metric, which is what I started to post about. We could not go ahead without that.

P123, please give me daily updates of the DataMiner downloads for machine-learning rebalances so I can use my own metrics and all of the tools offered by Sklearn. I assume you are already working on that, and that we have not heard from you because you are going to make it very easy and possibly more affordable for newbies interested in using P123 for machine learning.

You will have to inform us about your methods for cross-validation, at least, when you release your AI/ML product, especially if you do not have daily rebalances with DataMiner available by then. Advanced members could ignore some early issues with the AI/ML release, and help with any of them if invited to do so, while we do our own cross-validation, recursive feature elimination, and other important things over at Sklearn with the DataMiner downloads, and then use the trained models for rebalances.

Jim