Preview of AI Factor v1.1: intelligent features, grid search, categorical & macro

Version 1.1 is currently being tested. We made last-minute changes because the many new options made it hard to use. We also revamped the predefined features, focusing on what makes sense to feed to an ML algorithm. There were also some performance challenges, since the new version requires a lot more CPU resources. We hope to release this week.

Here is the updated documentation of the new normalization types. Currently only the first one is available.

Normalization Types (coming soon)

Feature normalization is a process that transforms feature values to a similar range and reduces the skewing effect of outliers. A normalization can be one of:

  1. global: cross-sectional vs. other stocks in the entire dataset.
  2. by-date: cross-sectional vs. other stocks on the same date.
  3. local: longitudinal vs. historical values for the same stock.
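The three scopes above differ only in which group of values each observation is compared against. Here is a minimal sketch in plain Python, assuming a z-score scaler and a made-up toy dataset. The names `rows`, `zscores`, and `normalize` are hypothetical and are not part of AI Factor.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical toy panel: (date, ticker, raw feature value).
rows = [
    ("2024-01-31", "AAA", 1.0),
    ("2024-01-31", "BBB", 2.0),
    ("2024-01-31", "CCC", 3.0),
    ("2024-02-29", "AAA", 2.0),
    ("2024-02-29", "BBB", 4.0),
    ("2024-02-29", "CCC", 9.0),
]

def zscores(values):
    """Z-score each value against the mean and population std of `values`."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

def normalize(rows, key):
    """Z-score each row's value within the group selected by key(row)."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[key(row)].append(i)
    out = [0.0] * len(rows)
    for idx in groups.values():
        group_values = [rows[j][2] for j in idx]
        for i, z in zip(idx, zscores(group_values)):
            out[i] = z
    return out

z_global = normalize(rows, key=lambda r: "all")   # whole dataset is one group
z_by_date = normalize(rows, key=lambda r: r[0])   # one group per date
z_local = normalize(rows, key=lambda r: r[1])     # one group per stock

for row, g, d, l in zip(rows, z_global, z_by_date, z_local):
    print(row, round(g, 3), round(d, 3), round(l, 3))
```

Note that the local variant here uses each stock's full history for simplicity. A point-in-time implementation would presumably restrict each observation to values known on or before its date.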

Only the recommended normalizations are shown for a particular predefined feature. The complete list of normalization options is:

  1. Default Normalization (global or by-date): The default normalization for the AI Factor, which uses a z-score, min-max, or rank scaler. It is cross-sectional, applied either by date or across the entire dataset.
  2. Categorical: Values are passed directly to the algorithm.
  3. Min/Max (global, no trim): Normalize from 0 to 1 globally. Outliers are not trimmed.
  4. Normalized vs. Sector/Industry (by-date): Normalize cross-sectionally against the sector, sub-sector or industry for a particular date.
  5. PIT Normalization (local): Normalize locally with historical Point In Time values using FHist() functions.
  6. Loop Normalization (local): Normalize locally using Loop() functions.
  7. Series Regression Growth (global): Calculate the regression growth of a series formula, then normalize it using the Default Normalization.
  8. Series Regression Surprise (global): Calculate the regression surprise of a series formula, then normalize it using the Default Normalization.
  9. Series Regression R² (local): Calculate the regression R² of a series formula, then feed it directly to the algorithm.
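To illustrate why outlier handling matters for the scalers named in the list, here is a rough sketch of textbook z-score, min-max, and rank scaling. These are generic definitions, not AI Factor's actual implementation, and the function names and sample values are hypothetical. No trimming is applied, so the min-max result shows what a single outlier does when left untrimmed.

```python
from statistics import mean, pstdev

def zscore_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def minmax_scale(values):
    """Map the minimum to 0 and the maximum to 1, with no outlier trimming."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def rank_scale(values):
    """Map values to evenly spaced ranks in [0, 1]; ties are not averaged."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = position / (len(values) - 1)
    return ranks

# Hypothetical feature values with one large outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]

print("z-score:", [round(z, 3) for z in zscore_scale(values)])
print("min-max:", [round(m, 3) for m in minmax_scale(values)])
print("rank:   ", rank_scale(values))
```

With untrimmed min-max, the four ordinary values are squeezed into the bottom few percent of the range, while the rank scaler spreads them evenly regardless of how extreme the outlier is.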

NOTE: See the LinReg() reference for details on the regression function.
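For readers unfamiliar with the regression statistics, here is a minimal sketch of an ordinary least-squares fit of a series against time, returning the slope, intercept, and R². It is a generic textbook calculation, not the LinReg() implementation, and the `linear_fit` helper and sample series are hypothetical. How growth and surprise are derived from the fit is not stated here, so the sketch stops at the fit itself.

```python
def linear_fit(ys):
    """Ordinary least-squares fit of ys against x = 0, 1, ..., n-1.

    Returns (slope, intercept, r_squared).
    """
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared

# Hypothetical series, oldest value first.
steady = [10.0, 12.0, 14.0, 16.0, 18.0]   # perfectly linear
noisy = [10.0, 13.0, 11.0, 15.0, 14.0]    # upward trend with noise

print(linear_fit(steady))
print(linear_fit(noisy))
```

Because R² from a fit with an intercept always lies between 0 and 1, it is already on a bounded scale, which is consistent with it being fed to the algorithm without further normalization.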