Are Evolutionary Algorithms Computer-Resource Intensive?

All,

Link: (Subscribe to read). I somehow got past this paywall using a Google search: evolutionary algorithm SentientTechnologies financial times

"Sentient Technologies, an artificial intelligence company, plans to open its experimental hedge fund using “evolutionary” algorithmic trading to outside investors as early as this year….

...its hedge fund arm utilises an exotic form of artificial intelligence known as “evolutionary computation” inspired by how species develop over time."

Practical considerations:

  1. It would solve P123's costly memory-storage issues. To predict with a random forest, a long series of if/elif rules has to be stored in memory; for an evolutionary algorithm, only the rank weights need to be stored (see the short sketch after this list). People using this or similar methods have been doing it for a while without the need for new servers.

  2. You could reuse those weights as the starting point for further optimization later, so a "warm start" and online optimization are easy.

  3. And it is interpretable, which is a big reason some members prefer this or similar methods over the new AI/ML.
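
To make points 1 and 2 concrete, here is a minimal sketch (the factor columns, weights, and file name are hypothetical placeholders): the entire "model" an evolutionary algorithm produces is a small weight vector, prediction is just a weighted sum of ranks, and a later run can be seeded with the saved weights:

import numpy as np
import pandas as pd

# The whole "model" is just one weight per ranking factor (a few floats),
# versus the long if/elif rule set a random forest has to keep in memory.
factors = ["value_rank", "quality_rank", "momentum_rank"]  # hypothetical columns
weights = np.array([0.40, 0.35, 0.25])                     # hypothetical rank weights

def predict(df: pd.DataFrame) -> pd.Series:
    # Composite score = weighted sum of the factor ranks
    return pd.Series(df[factors].to_numpy() @ weights, index=df.index)

# "Warm start": save the weights and seed the next optimization run with them
np.save("last_weights.npy", weights)                       # hypothetical file name
initial_guess = np.load("last_weights.npy")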

Using P123's optimizer and a spreadsheet to randomize rank weights seems to be an established method. People at P123 use multiple universes, different time periods, and/or mod() to find "correlation among universes," and maybe for model averaging or cross-validation too, although we seldom call it that.
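
For what it's worth, here is a rough sketch of the sub-universe idea (the ID column, target name, and bucket count are my own placeholders): split a holdout into sub-universes with mod() and check whether a given set of rank weights scores similarly in each:

import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def ic_by_sub_universe(df, factors, weights, target="excess_returns", n_buckets=2):
    # Assign each stock to a sub-universe using mod() on a numeric ID column
    # ("p123_uid" is a hypothetical column name).
    bucket = df["p123_uid"] % n_buckets
    ics = []
    for b in range(n_buckets):
        sub = df[bucket == b]
        score = sub[factors].to_numpy() @ np.asarray(weights)
        ic, _ = spearmanr(score, sub[target])
        ics.append(ic)
    # Similar ICs (high "correlation among universes") suggest the weights generalize
    return ics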

What do you call the method? I am going to call it jrinne's (or [your name here]'s) proprietary algorithm and pretend no one else has ever thought of it before. I will pretend that my personal modifications of an evolutionary algorithm are vitally important and new, with no room for improvement or automation.

But I am not entirely against formal cross-validation and automation coupled with this method if it makes my life easier, even if using an existing library means acknowledging that sklearn, at least, has done this before me.

While it is clearly inferior to jrinne's proprietary method, I am considering using P123 downloads and maybe this package with cross-validation and/or early stopping: sklearn-deap
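
In case it is useful to anyone else, here is roughly how I picture wiring sklearn-deap into a walk-forward split. This is only a sketch based on the package's GridSearchCV-style interface; the Ridge estimator, parameter grid, and scoring choice are stand-ins, not anything P123-specific:

from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit
from evolutionary_search import EvolutionaryAlgorithmSearchCV  # pip install sklearn-deap

# Hypothetical hyperparameter grid for a stand-in estimator; the evolutionary
# search explores it instead of a brute-force grid, scoring each candidate
# with walk-forward cross-validation.
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = EvolutionaryAlgorithmSearchCV(
    estimator=Ridge(),
    params=param_grid,
    scoring="neg_mean_squared_error",
    cv=TimeSeriesSplit(n_splits=5),   # walk-forward folds rather than shuffled CV
    population_size=20,
    gene_mutation_prob=0.10,
    gene_crossover_prob=0.5,
    tournament_size=3,
    generations_number=10,
)
# search.fit(X, y)   # X = feature matrix, y = target (e.g., excess returns)
# print(search.best_params_, search.best_score_)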

I don't mean to take credit for someone else's ideas with "jrinne's proprietary algorithm." I actually think evolutionary algorithms were developed before I learned to speak in the 1950s:

"Alan Turing proposed the idea of using evolution-inspired processes for problem-solving in his paper "Computing Machinery and Intelligence" (1950), though he didn't implement an actual algorithm."

Yuval, SteveA, Whycliffe's and others clearly deserve credit for the first P123 implementations of methods that have any similarity to evolutionary algorithms.

My actual point is that this is not a secret proprietary algorithm (not jrinne's, at least). Rather, it is an established algorithm that is already automated in Python and might be improved upon, and there is a library available for those wanting to automate it and/or couple it with cross-validation.

And @marco and @Riki37 might want to consider automating and improving it rather than abandoning it, or, more accurately I guess, leaving it unfinished.

I think I will be looking into it myself using the API or P123 downloads. I even have some early code that will need modification, I am sure:


import random

import numpy as np
import pandas as pd
from scipy.stats import spearmanr
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import StandardScaler
from deap import base, creator, tools, algorithms

# 1. Data Loading and Preprocessing
def load_data(file_path):
    data = pd.read_csv(file_path)
    return data

# 2. Feature Scaling (StandardScaler here; swap in rank scaling if preferred)
def scale_features(data, features):
    scaled = data.copy()
    scaled[features] = StandardScaler().fit_transform(scaled[features])
    return scaled

# 3. Time Series Split (walk-forward folds to avoid look-ahead bias)
def create_time_series_split(n_splits=5):
    return TimeSeriesSplit(n_splits=n_splits)

# 4. Fitness Function
# An individual is a vector of feature weights. As a placeholder, fitness is the
# mean Spearman rank correlation (IC) between the weighted composite score and
# the target over the validation folds. This could instead involve calculating
# portfolio returns, Sharpe ratio, etc.
def evaluate(individual, data, features, target, tscv):
    weights = np.asarray(individual)
    X = data[features].to_numpy()
    y = data[target].to_numpy()
    ics = []
    for _, val_idx in tscv.split(X):
        score = X[val_idx] @ weights
        ic, _ = spearmanr(score, y[val_idx])
        ics.append(0.0 if np.isnan(ic) else ic)
    return (float(np.mean(ics)),)  # DEAP expects a tuple

# 5. Evolutionary Algorithm Setup
def setup_ea(n_features, data, features, target, tscv):
    creator.create("FitnessMax", base.Fitness, weights=(1.0,))
    creator.create("Individual", list, fitness=creator.FitnessMax)

    toolbox = base.Toolbox()
    # Individuals are random weight vectors in [0, 1]
    toolbox.register("attr_weight", random.random)
    toolbox.register("individual", tools.initRepeat, creator.Individual,
                     toolbox.attr_weight, n=n_features)
    toolbox.register("population", tools.initRepeat, list, toolbox.individual)
    # Genetic operators (blend crossover, Gaussian mutation, tournament selection)
    toolbox.register("evaluate", evaluate, data=data, features=features,
                     target=target, tscv=tscv)
    toolbox.register("mate", tools.cxBlend, alpha=0.5)
    toolbox.register("mutate", tools.mutGaussian, mu=0.0, sigma=0.2, indpb=0.2)
    toolbox.register("select", tools.selTournament, tournsize=3)
    return toolbox

# 6. Main Evolutionary Algorithm Loop
def run_evolutionary_algorithm(toolbox, n_generations=100, pop_size=50):
    # eaSimple handles selection, crossover, mutation, and evaluation each generation
    population = toolbox.population(n=pop_size)
    population, _log = algorithms.eaSimple(population, toolbox, cxpb=0.5, mutpb=0.2,
                                           ngen=n_generations, verbose=True)
    return tools.selBest(population, k=1)[0]

# 7. Results Analysis and Visualization
def analyze_results(best_individual, features):
    # Print the evolved weights; add plots or portfolio simulations as needed
    for name, weight in zip(features, best_individual):
        print(f"{name}: {weight:.4f}")
    print("Best fitness (mean IC):", best_individual.fitness.values[0])

# Main execution
if __name__ == "__main__":
    data = load_data("path_to_your_data.csv")
    features = [...]  # List of feature columns
    target = "excess_returns"  # Or whatever your target column is

    scaled_data = scale_features(data, features)
    tscv = create_time_series_split(n_splits=5)

    toolbox = setup_ea(len(features), scaled_data, features, target, tscv)
    best_solution = run_evolutionary_algorithm(toolbox, n_generations=100)

    analyze_results(best_solution, features)

In my experience, and in many papers, linear methods have a strange advantage in large-cap stocks; conversely, non-linear methods have an advantage in microcaps. I would have expected the opposite, i.e., that more complex methods would have a (greater) advantage in the more competitive large-cap stocks.

In microcap stocks, even the Fama-French eight-factor model (FF5 + MOM + STREV + LTREV) produces weirdly superior results when you have access to the AI tool. And these effects did not decay after Fama and French formalized the factor model in 2017.

Edit: In fact, my experiments using raw data as features tend to capture things similar to the FF8 factor model.
