Genetic algorithm to replace manual optimization in P123 classic

Thank you. Interesting you say that. I use 150 a lot of the time which I arrived at through trial and error. Makes a lot of sense. Easy to change in the code.

1 Like

Thanks @Jrinne
This looks like your result file. I was more interested in your input file.
Is it safe to assume you generated it from DataMiner?

Thanks
Tony

Sorry. I may not understand still. The screenshot above is all a download from DataMiner (or the newer download option), except that I changed the names of the features, e.g., to 'Feature1', which replaced the name in the P123 download. The only other changes were to create the 'ExcessReturn' column and delete some columns (features).

The other way I can illustrate what that file is: it is a screenshot of df1 = pd.read_csv('~/Desktop/DataMiner/xs/DM1xs.csv', parse_dates=['Date']) in the code.

I don't think there is anything in the code that adds the results to the csv file.

The target 'ExcessReturn' is the future return generated by P123's Future%Chg(5) minus the return of the universe for that week.

But the predicted returns are never added to the spreadsheet.

Sorry if I still do not understand.
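For anyone wanting to reproduce that column, the 'ExcessReturn' construction described above can be sketched in pandas (the toy frame and the 'FutureRet' column name are made up here; only 'ExcessReturn' matches my file):

```python
import pandas as pd

# Hypothetical toy frame standing in for the DataMiner download:
# one row per (Date, Ticker), with the raw P123 Future%Chg(5) value
# in a column I am calling 'FutureRet' here.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-05"] * 3 + ["2024-01-12"] * 3),
    "Ticker": ["A", "B", "C"] * 2,
    "FutureRet": [2.0, -1.0, 0.5, 1.0, 3.0, -4.0],
})

# Subtract each week's (equal-weight) universe mean return from every
# stock's future return to get the excess-return target.
df["ExcessReturn"] = df["FutureRet"] - df.groupby("Date")["FutureRet"].transform("mean")
```

With the real download, the same groupby/transform line works no matter how many tickers are in the universe each week.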

@Jrinne I understand now. Thank you!

1 Like

All,

Are you considering using this? If so, I have some additional data that may help you decide whether to try it. I refined the code to do a 5-fold cross-validation (no gaps and shuffle = False) to get the validated results. I also ran code that did not allow negative weights (but might set a weight to zero).

Model 1 (Allow Negative Weights): CV Fitness (Average Weekly Return): 0.9186

Model 2 (Restrict to Non-Negative Weights): CV Fitness (Average Weekly Return): 0.9086

People can draw their own conclusions about whether to try this at home with their own data and code modifications.
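For anyone who wants to replicate the validation, the chronological 5-fold split (shuffle = False, no gaps) can be sketched as follows; the GA's CV fitness is then the average of the metric computed on each validation block:

```python
# Minimal sketch of the chronological 5-fold split used above:
# shuffle=False, contiguous validation blocks, no rows discarded.

def contiguous_kfold(n_rows, k=5):
    """Yield (train_idx, val_idx) pairs with time order preserved."""
    fold = n_rows // k
    for i in range(k):
        start = i * fold
        stop = (i + 1) * fold if i < k - 1 else n_rows
        val = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_rows))
        yield train, val

folds = list(contiguous_kfold(100, k=5))
```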

Please consider sharing your results if you do decide to do that.

I have rewritten the code above and am now exploring maximizing the Information Ratio (with the universe as the benchmark). The Information Ratio is mathematically equivalent to Cohen’s D in statistics, which is a measure of effect size—a concept widely used across scientific disciplines.

I think the Information Ratio is a better metric than the Sharpe Ratio because it isolates stock selection skill by removing market noise. Cohen’s D is ubiquitous in statistics because it measures practical significance (effect size), not just statistical significance, making it a robust choice for ranking systems.

It’s interesting to note that finance likely borrowed from statistics when developing the Information Ratio, given how closely it aligns with Cohen’s D.

There are probably other great metrics, and several have already been discussed in this thread. But I think this provides a better answer than I previously gave on what makes a good metric for ranking optimization.
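For concreteness, here is the per-period information ratio I am maximizing, sketched with stdlib Python and made-up weekly return numbers:

```python
from statistics import mean, stdev

def information_ratio(port_returns, bench_returns):
    """Per-period information ratio: mean active return divided by the
    standard deviation of the active returns (tracking error).
    Multiply by sqrt(52) for an annualized figure from weekly data."""
    active = [p - b for p, b in zip(port_returns, bench_returns)]
    return mean(active) / stdev(active)

# Made-up weekly returns in percent; the universe is the benchmark.
port = [1.2, 0.8, -0.5, 1.5, 0.3]
universe = [0.9, 0.6, -0.2, 1.0, 0.1]
ir = information_ratio(port, universe)
```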

In addition to this excellent idea, this would lend itself to integrating a strategy's Alpha as a metric. I believe you have been requesting alpha as a metric for the AI for a while now. That would be possible with a genetic algorithm or a modification of the code above.

Alpha as a metric is something I might explore myself.

For anyone considering this, the compute resource requirements are pretty favorable. Parallelization can be implemented with DEAP (Distributed Evolutionary Algorithms in Python), as the name implies.

2 Likes

The Sharpe ratio has a huge problem and it's much much worse with the information ratio. If the average returns are negative (or, with the information ratio, if the average returns minus the bench returns are negative), then portfolios with a great deal of volatility will score better than portfolios with low volatility. This is exactly the reverse of what you want. So if you're using one of those ratios, be sure that your code does something about this flaw.
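This flaw is easy to demonstrate with a toy example (the weekly returns below are made up):

```python
from statistics import mean, stdev

def sharpe(returns, rf=0.0):
    # Plain per-period Sharpe: mean excess return over its std dev.
    excess = [r - rf for r in returns]
    return mean(excess) / stdev(excess)

# Two losing portfolios with the same -1% average weekly return:
calm = [-0.012, -0.010, -0.008, -0.010]  # low volatility
wild = [-0.060, 0.040, -0.050, 0.030]    # high volatility

# Dividing the same negative mean by a larger deviation pulls the
# ratio toward zero, so the wild portfolio scores "better".
```

Any optimizer maximizing this ratio over losing candidates will therefore prefer the wilder portfolio.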

These ratios have other flaws too, but that's the biggest one. The others are:

  • They're calculated on the basis of average returns. This makes no financial sense. If your portfolio goes up 20% and then down 20%, you have an average return of 0%, but you've actually lost 4% overall (a geometric mean return of about −2.02% per period).

  • They operate on the assumption that returns are normally distributed. Non-normal distributions distort standard deviation numbers terribly.

  • They're extremely sensitive to outliers.

If you're going to use the Sharpe or information ratio to optimize a system, I would advise trimming the returns to eliminate outliers, using geometric mean in the numerator, using absolute deviation instead of standard deviation in the denominator, and if the numerator is negative (which is quite likely with the information ratio), multiplying by the deviation rather than dividing by it.
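A sketch of a ratio with those four adjustments (the trim fraction and return units are arbitrary choices for illustration, not a standard definition):

```python
from statistics import mean

def robust_ratio(returns, trim=0.05):
    """Sketch of a ratio with the four adjustments suggested above.
    `returns` are fractional per-period returns (0.01 == 1%)."""
    # 1. Trim the extreme tails to blunt outliers.
    r = sorted(returns)
    k = int(len(r) * trim)
    if k:
        r = r[k:-k]
    # 2. Geometric mean return in the numerator.
    prod = 1.0
    for x in r:
        prod *= 1.0 + x
    gmean = prod ** (1.0 / len(r)) - 1.0
    # 3. Mean absolute deviation instead of standard deviation.
    m = mean(r)
    mad = mean(abs(x - m) for x in r)
    # 4. Multiply rather than divide when the numerator is negative,
    #    so extra volatility is penalized on both sides of zero.
    return gmean * mad if gmean < 0 else gmean / mad
```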

2 Likes

Yes, geometric mean. Check out the Safe Haven book by Spitznagel.

1 Like

What really creates a problem here is the negative coskewness. This is difficult to avoid for strategies with positive alpha, though generally this can be factored in by deducting from the alpha the expected cost of the option needed to hedge the corresponding negative coskewness.

Of course, this actually has only a negligible practical impact on strategy selection. It mostly makes one feel like one has something in the works in the face of the inevitably great uncertainty of the market.

https://sci-hub.se/https://doi.org/10.1287/mnsc.2019.3419

1 Like

I sometimes use VIXM for another beta (the tail-risk beta) when calculating alpha.

ZGWZ,

Nice point! Without a doubt, I’ve been guilty of more than a little action bias in the past. Okay, a lot of action bias. It’s not even a close call.

If by “Action Bias” you mean spending years optimizing feature weights when equal weights perform just as well, then your post is highly relevant to this thread.

Why?

Because once we accept that equal weighting is nearly as effective as optimized weighting, the problem of ranking system optimization becomes far more tractable. Instead of fine-tuning weights, the challenge shifts to a binary decision:

Should a feature be included or not?

This makes the problem computationally lighter and well-suited for genetic algorithms. The coding for such an approach is straightforward enough that I won’t include it here.

Of course, this only applies if one assumes that equal weights perform nearly as well as optimized weights. I’m not making a claim about whether that’s universally true—just that, for those who hold this assumption, a genetic algorithm can likely find a near-optimal solution without excessive complexity.
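For readers who want a concrete starting point anyway, here is a minimal sketch of such a binary inclusion/exclusion GA in plain stdlib Python (the per-feature "values" and the complexity penalty are invented stand-ins; in practice the fitness would be your backtested metric for the equal-weight ranking system built from the included features):

```python
import random

random.seed(0)  # reproducible toy run

# Hypothetical stand-alone "value" for each of 8 candidate features,
# with a small penalty per included feature. A real fitness function
# would backtest the equal-weight rank built from the included ones.
VALUES = [0.4, -0.2, 0.7, 0.1, -0.5, 0.3, 0.0, 0.6]
PENALTY = 0.05

def fitness(mask):
    return sum(v for v, m in zip(VALUES, mask) if m) - PENALTY * sum(mask)

def evolve(n_feats=8, pop_size=30, gens=40, mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(n_feats)]
           for _ in range(pop_size)]
    for _ in range(gens):
        # Tournament selection (size 3) of parents.
        parents = [max(random.sample(pop, 3), key=fitness)
                   for _ in range(pop_size)]
        # One-point crossover plus per-bit flip mutation.
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = random.randrange(1, n_feats)
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                children.append([bit ^ (random.random() < mut)
                                 for bit in child])
        pop = children
    return max(pop, key=fitness)

best_mask = evolve()  # 1 = include the feature, 0 = exclude it
```

The same loop runs unchanged whether the fitness is excess return, information ratio, or anything else you can compute from a backtest.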

This is a method that can be fully automated, literally avoiding any human action at all—so clearly, it has the potential to avoid a bit of action bias in the process.

That said, this approach isn’t for everyone, and I’m not suggesting that ZGWZ (or anyone else) would necessarily consider this the best way to optimize an equally weighted ranking system.

It just connects two themes in this thread in a way that seems obvious—once you see it.

ZGWZ, whether I am echoing your point exactly or not, you have some excellent ideas about action bias and simplifying the task of developing a ranking system.

Edit:

This could be just a starting point—a true screening mechanism. It could identify features that, based on a user-defined metric, both add excess returns [or put your metric here] and work well together (at least at equal weight).

From there, one could take creative approaches—including additional methods that explore variable weighting.

Nothing would stop a person from adding or removing a feature based on domain knowledge at any point in the process!

Cohen's d is not the information ratio in the investing problem. The correct calculation would be alpha divided by the pooled standard deviation.

The “pooled variance” of different strategies may be obtained by generating a large number of strategies with different seeds that generate random stock purchase decisions, but this is very likely to significantly underestimate the pooled variance.
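For reference, Cohen's d divides the mean difference by the pooled standard deviation, i.e. the square root of the pooled variance; a minimal stdlib sketch:

```python
from statistics import mean, stdev

def cohens_d(sample_a, sample_b):
    """Cohen's d: mean difference divided by the pooled standard
    deviation (the square root of the pooled variance)."""
    na, nb = len(sample_a), len(sample_b)
    var_a, var_b = stdev(sample_a) ** 2, stdev(sample_b) ** 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / pooled_var ** 0.5
```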

1 Like

Thank you ZGWZ!

You are right, and this is a very sophisticated understanding of Cohen's D. I might have tried pooling the universe standard deviation and the standard deviation of the strategy to get a better approximation of Cohen's D and the pooled variance you describe.

Claude 3 describes a "Monte Carlo" method to get pooled variance (which may have some similarities to what you describe), and says this about it:

  • Creating a more robust estimate of the variance distribution
  • Better representing the true "population" of potential strategy outcomes

Interesting! I won't try to expand on the advantages of your method and I welcome any further corrections on my attempt to explain its advantages here.

I’m continuing to explore the use of a genetic algorithm for feature selection, applying equal weighting to selected features. This approach, inspired by your discussion, helps keep computational demands manageable while still achieving a near-optimal solution on a MacBook Pro.

I say near-optimal because, as you can imagine, the algorithm can get stuck in local maxima. However, it does consistently converge on a solution, and I’m considering ways to address the local-maximum problem.
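One common mitigation is simply restarting the search from several independent random initializations and keeping the best run. A stdlib sketch (a one-bit-flip hill climber stands in for the GA here):

```python
import random

def hill_climb(fitness, n_bits, steps=200, rng=None):
    """Single search run; a one-bit-flip hill climber stands in for
    the GA here. Accepts a change only if fitness improves."""
    rng = rng or random
    mask = [rng.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        trial = mask[:]
        trial[rng.randrange(n_bits)] ^= 1
        if fitness(trial) > fitness(mask):
            mask = trial
    return mask

def with_restarts(fitness, n_bits, restarts=10):
    """Run several independently seeded searches and keep the best,
    a hedge against any single run stalling in a local maximum."""
    runs = [hill_climb(fitness, n_bits, rng=random.Random(seed))
            for seed in range(restarts)]
    return max(runs, key=fitness)
```

Other cheap options include raising the mutation rate when the population stagnates or keeping the population larger and more diverse.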

Thanks again for your insights!

Edit: It occurs to me that genes in the real world are either turned off or on. Their effect can be modulated but this is through other genes that are turned off or on. To the extent that features are like genes we are following a proven, time-tested algorithm when we use binary inclusion or exclusion of a feature. One developed by God or nature herself. So it's not entirely untested.

Hmm… Glad natural evolution did not get stuck in a local maximum… some mutated version of a slug, say :thinking: :open_mouth:

On a serious note: Regarding natural evolution, the asteroid that wiped out the dinosaurs may have shaken things up just enough to move us out of a local maximum—literally.

Although genetic algorithms are based on this belief, we don't even know if evolution optimizes a universal function (such as Dawkins' belief in “God's utility function”) or tends to deteriorate overall.

And because of the universality of GxG and GxE, there is good reason to believe that natural selection tends to produce long-term harm (which can manifest itself in disease, aging, and a myriad of other nasty things) for the sake of short-term adaptation, and that overall the process is very slow (because it does not exceed the Haldane limit for slow-reproducing species such as humans and large slow-growing trees). Too-rapid adaptation can lead to falling into temporary local optima. Even with genes that are turned off as options to prevent this risk, there is still reason to believe that nearly neutral (slightly deleterious) variation is what is prevalent, and that the main effect of natural selection is to eliminate the vast majority of obviously deleterious variation, not to produce or select for better variation.

Yes, there are no blind watchmakers or selfish genes, only largely meaningless random stupidity and, more likely, worsening genes.

While I agree that revolutionary events like asteroid impacts tens of millions of years ago (I'm not talking about "revolutions" like the "Industrial Revolution," "Quiet Revolution," or "Internet Revolution," but something closer to a political revolution) are more likely to have significant benefits, I don't think these benefits come from "getting organisms out of local maxima." Simply put, they were never close to a local maximum in the first place.

The raw meta-analytic data for the stress gradient hypothesis does not hold up even under the PEESE correction, which implies that the opposite is actually true. So the current severe deterioration of cut-throat competition is more likely due to a deterioration in the overall environment.

1 Like

And some implementations are not particularly fast, e.g., "Total Runtime: 39h 57m 53s".