Random forests are the simplest things coders could ever look at

Coders at P123.

If you have ever looked at the code in a random forest prediction model, it is all if/else statements. Nothing more.

It is the simplest practical code possible, or near to it.
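
To make the "all if, else" point concrete, here is a minimal hand-written sketch of what prediction code for a very small forest can look like. The feature names (value_rank, momentum_rank), the thresholds, and the leaf values are all made up for illustration; a trained model would have many more trees and deeper branches, but the same shape.

```python
# Hypothetical two-tree "forest". Feature names, thresholds, and leaf
# values are invented for illustration, not taken from any real model.

def tree_1(value_rank: float, momentum_rank: float) -> float:
    # Each tree is just nested if/else ending in a constant prediction.
    if value_rank <= 0.5:
        if momentum_rank <= 0.3:
            return -0.02
        else:
            return 0.01
    else:
        return 0.03


def tree_2(value_rank: float, momentum_rank: float) -> float:
    if momentum_rank <= 0.6:
        return 0.00
    else:
        if value_rank <= 0.4:
            return 0.01
        else:
            return 0.05


def forest_predict(value_rank: float, momentum_rank: float) -> float:
    # A regression forest's prediction is the average of its trees.
    preds = [
        tree_1(value_rank, momentum_rank),
        tree_2(value_rank, momentum_rank),
    ]
    return sum(preds) / len(preds)


print(forest_predict(0.7, 0.8))
```

The hard part is training (choosing the splits), not the prediction code itself.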

None of this is hard for good coders. Some people just want to make it look hard so they can seem very deep or mysterious.

Jim

Hi Jim,

Appreciate your wisdom, guidance and push towards AI/ML. I’m a newbie but very interested in acquiring deeper knowledge in this field. You mentioned in the past that Elastic Net worked well for you for feature selection and optimization of fundamental factors. Are you finding more success with Random Forest now or is Elastic Net still your preferred method or is it case-by-case? Any hints will be appreciated. Thank you!

Sam

Sam,

Thank you for your interest.

Just as an aside, ElasticNet will give you weights that could possibly be entered directly as weights for the factors (or features) in P123’s ranking system. There is a potential ease-of-use benefit there.
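
As a rough sketch of that idea, the snippet below fits scikit-learn's ElasticNet on synthetic data and rescales the absolute coefficients to percentages that sum to 100. The factor names, the data, the alpha and l1_ratio values, and the rescaling step are all assumptions for illustration. How such weights would actually be entered into a ranking system is not shown here, and a negative coefficient would need separate handling (for example, reversing the factor's direction).

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the factor names are hypothetical.
rng = np.random.default_rng(0)
factor_names = ["factor_a", "factor_b", "factor_c"]
X = rng.normal(size=(500, 3))
y = 0.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Standardize so coefficient sizes are comparable across factors.
X_std = StandardScaler().fit_transform(X)

# alpha and l1_ratio are arbitrary here; in practice they would be tuned.
model = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X_std, y)

# One possible (assumed) way to turn coefficients into percentage weights.
abs_coef = np.abs(model.coef_)
weights = 100 * abs_coef / abs_coef.sum()

for name, coef, weight in zip(factor_names, model.coef_, weights):
    print(f"{name}: coef={coef:.3f}, weight={weight:.1f}%")
```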

But that was not your question. I have limited data, and the data I have been using has a potential look-ahead bias, as Walter pointed out to me just this morning in this post: Data download the day of rebalance for machine learning - #6 by WalterW

I have done random forests before, but in retrospect my factor choice was poor then, and I think you are better off if we ignore those results, even if I remembered them correctly.

That being said, the most direct short answer to your question is that, with the limited and flawed data that I have, random forests are doing better.

I did a grid-search cross-validation with these parameters:

param_grid = {
    'max_features': [0.3, 'sqrt', 'log2'],
    'min_samples_leaf': [3000, 10000, 20000]
}

I got this output: Best Parameters: {'max_features': 0.3, 'min_samples_leaf': 3000}. I had tried smaller 'min_samples_leaf' values in previous grid searches.

So one piece of advice I have is to include some big 'min_samples_leaf' numbers in your grid search.
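
For anyone who wants to try a grid like this, here is a minimal sketch using scikit-learn's GridSearchCV. Only the parameter grid comes from the post; the synthetic data, the choice of RandomForestRegressor (rather than a classifier), n_estimators=20, and cv=3 are assumptions made to keep the example small and runnable.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: 60,000 rows so that leaves of 3,000 to 20,000
# samples are possible. Real factor data would replace X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(60_000, 10))
y = 0.1 * X[:, 0] - 0.05 * X[:, 1] + rng.normal(size=60_000)

# The grid from the post.
param_grid = {
    "max_features": [0.3, "sqrt", "log2"],
    "min_samples_leaf": [3000, 10000, 20000],
}

# n_estimators and cv are kept small here only for speed (assumed values).
search = GridSearchCV(
    RandomForestRegressor(n_estimators=20, random_state=0),
    param_grid,
    cv=3,
)
search.fit(X, y)

print("Best Parameters:", search.best_params_)
```

Large min_samples_leaf values force each leaf to average over many samples, which limits how much a tree can fit to noise.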

I hope that helps some. Just like you, I will need more data before drawing any reliable conclusions.

Jim
