max_features = "log2" vs the default

max_features = "log2" saves some time (I used log2 in Extra Trees III(2); everything else is a copy of Extra Trees III):

Equal or maybe even better performance:

log2:

Default:

I always use max_features = "log2" now. I had been using max_features = "sqrt" (which is similar) until recently. Might be a nice default for some future models.
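For anyone wondering why "log2" is faster and how close it is to "sqrt", here is a rough sketch of how many candidate features each setting considers per split. It assumes scikit-learn-style semantics (floor of the square root or base-2 logarithm, with a minimum of 1); the helper name `features_per_split` is made up for illustration, and the library and default used in the post above may differ.

```python
import math


def features_per_split(n_features, max_features=None):
    """Candidate features drawn at each split (hypothetical helper).

    Assumes scikit-learn-style rules: "sqrt" and "log2" are floored
    and never drop below 1; None means every feature is considered.
    """
    if max_features is None:
        return n_features
    if max_features == "sqrt":
        return max(1, int(math.sqrt(n_features)))
    if max_features == "log2":
        return max(1, int(math.log2(n_features)))
    raise ValueError(f"unsupported max_features: {max_features!r}")


# Compare the settings for a few feature counts.
for n in (16, 100, 1000):
    print(
        n,
        features_per_split(n, "sqrt"),
        features_per_split(n, "log2"),
        features_per_split(n),
    )
```

Under these assumptions the two settings coincide for small feature sets and drift apart as the feature count grows, so "log2" evaluates fewer candidates per split on wide datasets.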

Jim

Indeed, max_features (mtry in the R implementation) is important:

Out of these parameters, mtry is most influential both according to the literature and in our own experiments. The best value of mtry depends on the number of variables that are related to the outcome.

In addition, some researchers have found that max_features needs to be paired with a variable selection method to reach optimal performance only when the dataset contains noisy (irrelevant) features.
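One way to see why noisy features matter here: if each split draws its candidate features uniformly at random without replacement (an assumption, and a simplification of what real implementations do), the chance that at least one relevant feature is among the candidates follows a hypergeometric calculation. The function name and the example numbers below are illustrative only, not results from any model.

```python
import math


def prob_any_relevant(n_features, n_relevant, n_candidates):
    """P(at least one relevant feature among the split candidates).

    Assumes candidates are sampled uniformly without replacement:
    1 - C(n_features - n_relevant, m) / C(n_features, m).
    """
    m = min(n_candidates, n_features)
    none_relevant = math.comb(n_features - n_relevant, m) / math.comb(n_features, m)
    return 1.0 - none_relevant


# Illustrative case: 100 features, only 5 related to the outcome.
for m in (6, 10, 100):
    print(m, round(prob_any_relevant(100, 5, m), 3))
```

With few relevant features and a small candidate set, many splits see only noise, which is one plausible reason variable selection helps in exactly that situation.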
