We are still dealing with backend issues for AI Factor. The network storage it uses needs to be tuned. Sorry it's taking so long. This directly affects several things in AI Factor:
Loading datasets
Model validation
Predictor training
Predictions
Please note that this network storage (a CephFS system) is only used by AI Factor, so nothing else should be affected. Things work, but are very inconsistent, with huge outliers in delays, as the network storage is running into issues keeping things consistent. Here's an example of two identical validations, both started at the same time: one took 30 seconds, the other took 16 minutes.
Anybody else having issues with timeouts on rebalancing AI Factor systems? I’m still seeing problems where my rebalance fails even after multiple attempts.
I am still having the same issues I was having earlier. When I update the number of buckets or the slippage, the loading times out. I also cannot delete any AI Factors; I get an unexpected error message in a red pop-up.
Thanks to you and the team for your efforts, Marco. The delays and timeouts are a minor inconvenience at the moment. I appreciate your frequent status updates and am sure y’all will get it tuned correctly.
The majority of my live strategies (with or without AI Factors) did not rebalance this morning and are still showing R status.
I would like to understand the best approach for today’s rebalancing:
should I proceed manually (not sure if this will work),