LightGBM now available; BUT...

Dear All,

You can now use the LightGBM algorithm; however, the predefined models we released are not correct. I did a simple test using all three of them and got very decent performance, which is encouraging. The best was LightGBM II, which also ran the fastest of the three. The slowest was LightGBM I, so the parameters definitely need work.

We are working to fix the hyperparameters, but it's time consuming. We decided not to hide the predefined models while we fix them, hoping to get your suggestions. Any suggestions to improve the other predefined models are also welcome (tuning models seems to be more of an art).

Thank You

The current definitions are these:

LightGBM I

"n_estimators": 100, "max_depth": 4, "learning_rate": 0.1, "num_leaves": 16, "subsample": 0.8, "colsample_bytree": 0.8, "min_child_samples": 20

LightGBM II

"n_estimators": 300, "max_depth": 6, "learning_rate": 0.05, "num_leaves": 32, "subsample": 0.7, "colsample_bytree": 0.7, "min_child_samples": 25, "reg_alpha": 0.3, "reg_lambda": 0.3, "bagging_freq": 5

LightGBM III

"n_estimators": 500, "max_depth": 12, "learning_rate": 0.01, "num_leaves": 128, "subsample": 0.6, "colsample_bytree": 0.6, "min_child_samples": 10

The obvious problem is min_child_samples, since the progression from I to III is 20, 25, 10.

Also LightGBM II has parameters not present in the other two models.
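For reference, a rough sketch of how these definitions might map onto lightgbm's scikit-learn estimator (an assumption on my part; the actual P123 backend may differ):

```python
# Sketch only: assumes the predefined dicts are passed straight to lightgbm's
# scikit-learn estimator; the actual P123 backend may differ.
import lightgbm as lgb

lightgbm_1 = {"n_estimators": 100, "max_depth": 4, "learning_rate": 0.1,
              "num_leaves": 16, "subsample": 0.8, "colsample_bytree": 0.8,
              "min_child_samples": 20}

lightgbm_3 = {"n_estimators": 500, "max_depth": 12, "learning_rate": 0.01,
              "num_leaves": 128, "subsample": 0.6, "colsample_bytree": 0.6,
              "min_child_samples": 10}

# Two things stand out when written this way:
# - num_leaves is capped by depth: a depth-4 tree has at most 2**4 = 16 leaves,
#   so LightGBM I is fully constrained by max_depth, while III allows 128
#   leaves inside depth-12 trees (up to 2**12 = 4096).
# - subsample (bagging_fraction) only takes effect when bagging_freq > 0,
#   and only LightGBM II sets bagging_freq.
model_1 = lgb.LGBMRegressor(**lightgbm_1)
model_3 = lgb.LGBMRegressor(**lightgbm_3)
```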

2 Likes

I tried the LightGBM models on a few AI Factors. LightGBM I was always the fastest for me, II second fastest, and III slowest (still fast). III gave the best result every time (really good results).

I almost can't believe the results. They seem too good and too fast.

OK, now I'm convinced param tuning is an art. My results made no sense, but yours show a perfect progression of slower speed and better results when going from I to III.

Guess we'll have to leave them the way they are. But they sure don't make sense to me. We'll discuss with the data scientist.

1 Like

With these fast algos, would it be an idea to include hyperparameter tuning in the P123 AI package?

Agree with Marco that param tuning is an art, especially for noisy financial markets.

Users should have at least a basic knowledge of how different parameters affect the model: how to construct a shallow, regularised, or deep model. And P123 should let users play with them and provide a highly flexible framework.
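As a rough illustration (placeholder numbers only, not recommendations), three such profiles might look something like this:

```python
# Illustrative placeholder values only - not tuned recommendations.
shallow = {"n_estimators": 200, "learning_rate": 0.1,
           "max_depth": 3, "num_leaves": 8}

regularised = {"n_estimators": 500, "learning_rate": 0.05,
               "max_depth": 6, "num_leaves": 31,
               "min_child_samples": 50,              # require bigger leaves
               "reg_alpha": 1.0, "reg_lambda": 1.0,  # L1 / L2 penalties
               "subsample": 0.7, "bagging_freq": 1,  # row subsampling
               "colsample_bytree": 0.7}              # feature subsampling

deep = {"n_estimators": 1000, "learning_rate": 0.01,
        "max_depth": -1, "num_leaves": 255}          # -1 = no depth limit
```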

Please also note that all the models in P123 use the same seed unless you change it. This means that every time you run a model with the same parameters, you get the same results. But in reality most ML models introduce randomness in many ways. By selecting the best model based on only one seed, you risk overfitting to the validation results. It may be prudent to run the same model multiple times with different seeds.

An example is provided below. The results highlighted in yellow are based on a model with the same settings but different seeds.
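To make that concrete, a minimal sketch (synthetic data, standalone lightgbm with its scikit-learn wrapper, not the P123 interface) of re-running the same configuration with different seeds:

```python
# Minimal sketch: the only thing that changes between runs is random_state.
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=5000, n_features=30, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

params = {"n_estimators": 300, "max_depth": 6, "learning_rate": 0.05,
          "num_leaves": 32, "subsample": 0.7, "colsample_bytree": 0.7,
          "min_child_samples": 25, "bagging_freq": 5}

scores = []
for seed in range(5):
    model = lgb.LGBMRegressor(**params, random_state=seed)
    model.fit(X_tr, y_tr)
    scores.append(r2_score(y_val, model.predict(X_val)))

# The spread across seeds is a rough lower bound on how much of any
# "improvement" between two configurations is just noise.
print("mean %.4f  std %.4f" % (np.mean(scores), np.std(scores)))
```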

2 Likes

I believe the most obvious omission is early stopping, e.g., "early_stopping_rounds": 50. Early stopping finds the optimal setting for n_estimators by monitoring performance on a validation set during training.

Copying LightGBM III and adding "early_stopping_rounds": 10 led to failed runs when I tried it. This is the complete code: {"n_estimators": 1000, "max_depth": 12, "learning_rate": 0.01, "num_leaves": 128, "subsample": 0.6, "colsample_bytree": 0.6, "min_child_samples": 10, "early_stopping_rounds": 10}

Several attempts to provide a validation set and eval_metric were not helpful.
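For what it's worth, in the standalone lightgbm package early stopping only kicks in when an evaluation set is supplied to fit(), which may be why adding the key to the parameter dict alone fails. A minimal sketch with the scikit-learn wrapper (newer lightgbm versions use a callback rather than an early_stopping_rounds fit argument):

```python
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=5000, n_features=30, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = lgb.LGBMRegressor(n_estimators=1000, max_depth=12, learning_rate=0.01,
                          num_leaves=128, subsample=0.6, colsample_bytree=0.6,
                          min_child_samples=10)

# Early stopping watches the eval_set and stops once the metric has not
# improved for `stopping_rounds` consecutive boosting rounds.
model.fit(X_tr, y_tr,
          eval_set=[(X_val, y_val)],
          eval_metric="l2",
          callbacks=[lgb.early_stopping(stopping_rounds=10)])

print("best iteration:", model.best_iteration_)
```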

There is also a 'device_type' setting. I wonder which of gpu, cpu, or cuda is optimal for the hardware provided. You might make the best one (based on the hardware) the default, if that has not already been done.
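For reference, in the standalone lightgbm package the device is selected like this (whether "gpu" or "cuda" works at all depends on how the library was built, so this is configuration-dependent):

```python
import lightgbm as lgb

# "cpu" is the default; "gpu" (OpenCL) and "cuda" require a lightgbm build
# compiled with the corresponding support.
model = lgb.LGBMRegressor(n_estimators=300, device_type="gpu")
```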

Jim

LightGBM always looks the best

1 Like

Tons of parameters to test for LightGBM. DART seems to work well on my data: "boosting_type": "dart"

1 Like

Thanks. Just to expand on this.

drop_rate controls the amount of dropout, e.g., "drop_rate": 0.1. And it runs (unlike my attempts at early stopping).
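A minimal sketch of a DART configuration in the standalone lightgbm scikit-learn wrapper (values are illustrative only):

```python
import lightgbm as lgb

# DART randomly drops a fraction of the previously built trees at each
# boosting round; drop_rate controls that fraction.
model = lgb.LGBMRegressor(boosting_type="dart",
                          drop_rate=0.1,
                          n_estimators=300,
                          learning_rate=0.05,
                          num_leaves=32)
```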

BTW, both DART and early stopping are regularization methods that help prevent overfitting.

If we keep optimizing the last 5 years of a sim using AI Factors in the ranking system, trying all of the possible hyperparameters for each model, without using regularization, and using multiple models, we will begin seeing overfitting again, if we are not seeing it in the forum already.

We can validate and then use the last 5 years as a final test of the best model found in the validation. P123 makes this possible if we want a more realistic idea of how a model will do when it is funded.

LightGBM has over a trillion possible combinations of hyperparameters if you count monotonic constraints, and maybe even without them. I am sure I could find a combination that does well without proper validation and testing.
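For anyone who wants to try them, a minimal sketch of how monotonic constraints are specified in the standalone lightgbm package (the three-feature layout below is purely illustrative):

```python
import lightgbm as lgb

# monotone_constraints takes one value per feature, in column order:
#  1 = prediction must be non-decreasing in that feature,
# -1 = non-increasing, 0 = unconstrained.
model = lgb.LGBMRegressor(n_estimators=300,
                          monotone_constraints=[1, 0, -1])
```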

4 Likes

I can't find the post or remember exactly what was said, but I believe Jrinne mentioned wanting to try "extra_trees": true with LightGBM. I can confirm that this was indeed very effective at improving results and really turned LightGBM around for me, making it a promising model, so thanks for that tip.
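For anyone curious, a minimal sketch of that setting in the standalone lightgbm scikit-learn wrapper (the other values are just placeholders):

```python
import lightgbm as lgb

# With extra_trees=True, LightGBM evaluates only one randomly chosen threshold
# per feature at each split (as in Extremely Randomized Trees), which adds an
# extra source of randomisation/regularisation.
model = lgb.LGBMRegressor(extra_trees=True,
                          n_estimators=300,
                          num_leaves=32)
```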

3 Likes

Has anyone managed to implement "early_stopping_rounds"?

I tried but couldn't (still).