AI Factor - Enable SKLearn Ridge Regression Model

The current linear ridge model under AI Factor relies on the ElasticNet model from SKLearn. This implementation of linear ridge regression performs very poorly compared to the Ridge model provided by SKLearn (sklearn.linear_model.Ridge).

When using the current linear ridge model in AI Factor, the model has an extremely long runtime and often fails to converge (emitting a message advising the user to use the Ridge model instead). In testing on my personal machine, the Ridge model shows significantly better performance.

Cheers,

Daniel


I second this request for sklearn Ridge support, please. I haven’t had luck getting ElasticNet to converge with L2 regularization, which hasn’t been a problem when using sklearn Ridge directly (offline).


According to ChatGPT, this is a subtle "gotcha" (see: ChatGPT - ElasticNet vs Ridge differences). It recommends using these parameters to get very close to sklearn.linear_model.Ridge() when using ElasticNet:

alpha=2.0, l1_ratio=0, fit_intercept=True, max_iter=10000

This is not what we use for our pre-defined “Linear Ridge” model. So it's all quite confusing.
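Part of the confusion is that the two estimators scale the loss differently: scikit-learn's ElasticNet divides the squared error by 2 * n_samples, while Ridge does not. So for l1_ratio=0, matching Ridge(alpha=a) requires ElasticNet(alpha=a / n_samples), i.e. the matching alpha depends on the number of rows, and no single fixed value (like the suggested 2.0) can be right for all datasets. A minimal sketch on synthetic data (variable names are ours, not AI Factor code) checking the equivalence numerically:

```python
import warnings

import numpy as np
from sklearn.linear_model import ElasticNet, Ridge

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

ridge_alpha = 2.0
ridge = Ridge(alpha=ridge_alpha, fit_intercept=True).fit(X, y)

# ElasticNet's objective divides the squared loss by 2*n_samples and
# Ridge's does not, so to match Ridge(alpha=a) we rescale: alpha = a / n.
# sklearn warns against l1_ratio=0 with coordinate descent; we silence
# that warning here just to demonstrate the equivalence.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    enet = ElasticNet(alpha=ridge_alpha / n, l1_ratio=0.0,
                      fit_intercept=True, max_iter=10000, tol=1e-10).fit(X, y)

print(np.max(np.abs(ridge.coef_ - enet.coef_)))  # tiny: the fits agree
```

With the rescaled alpha the coefficients agree to numerical tolerance; with a fixed alpha they generally will not, which is consistent with the convergence trouble people are reporting.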

We're investigating the best solution.

Current implementation

We expose a single "Linear" model that behaves like this:

  • When alpha = 0.0 we use sklearn.linear_model.LinearRegression(), which silently ignores parameters like l1_ratio and max_iter (which is confusing).
  • When alpha > 0 we use sklearn.linear_model.ElasticNet()

Simple Solution

Still rely on a single "Linear" model and force the following behavior based on hyperparameters:

  • With alpha = 0.0 it remains the same as above
  • With alpha > 0.0 and l1_ratio = 0.0 we will use sklearn.linear_model.Ridge()
  • With alpha > 0.0 and l1_ratio = 1.0 we will use sklearn.linear_model.Lasso()
  • With alpha > 0.0 and l1_ratio between 0 and 1 we will use sklearn.linear_model.ElasticNet()
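The dispatch above could be sketched as follows (make_linear_model is a hypothetical name for illustration, not an actual AI Factor function):

```python
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge


def make_linear_model(alpha, l1_ratio, **kwargs):
    """Hypothetical dispatcher illustrating the proposed 'Simple Solution'."""
    if alpha == 0.0:
        # Unchanged from today: plain OLS; l1_ratio and max_iter are ignored.
        return LinearRegression()
    if l1_ratio == 0.0:
        return Ridge(alpha=alpha, **kwargs)
    if l1_ratio == 1.0:
        return Lasso(alpha=alpha, **kwargs)
    return ElasticNet(alpha=alpha, l1_ratio=l1_ratio, **kwargs)
```

Note that the order of the checks matters: in scikit-learn, Lasso is implemented as a special case of ElasticNet, so the endpoint cases are handled before falling through to the general model.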

This should be quick for us and we can fix it this week. We just need to tweak the linear model code, modify the predefined models and gridsearches, and edit the documentation.

The downside is that it alters models that currently use either l1_ratio = 0.0 or l1_ratio = 1.0. But, since nobody is having luck with these settings, would this be a problem?

More involved solution

Add four new specific linear models (LinearRegression, Ridge, Lasso, and ElasticNet) for a 1:1 correspondence with scikit-learn, and deprecate the general "Linear" model.

This will be more in line with sklearn but more complex and time-consuming for us. It would take a couple of weeks for sure.

Let us know.

Thanks


I don’t think it would cause a problem, but what about an alternative solution of leaving the current Linear implementation as-is, and just adding a new Ridge model?

I think having the four separate models (1:1 with sklearn) is the cleanest solution and at least this gets you a step in that direction.


I’m fine with the proposed Simple Solution if that makes things simple for you guys. Having the four distinct models seems like a less confusing long-term solution for users, but if it is a significant amount of work it might not be worth the time.

Cheers,

Daniel