PCA Formula Exporter: Automatically turn Principal Components into Ranking Systems

@Scifospace came up with a brilliant idea: break the result of a Principal Component Analysis (PCA) down into its components and use them as new factors in P123.

While PCA is excellent for uncovering the "true" underlying drivers of stock returns by reducing multicollinearity and distilling correlated factors into independent components, translating the math into Portfolio123 formulas is incredibly tedious.

To make this effortless, I built a small desktop application: PCA Formula Exporter.

What it does: It takes your factor data (CSV export from Portfolio123), performs PCA, extracts the components, and automatically generates weighted formulas ready for Portfolio123. You can copy/paste them into a Linear Ranking System or use the "Copy P123 XML" feature to import a complete multi-component setup in seconds.
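For readers who want to see the idea in code, here is a minimal sketch of the core step: find the first principal component and write its loadings out as a weighted formula. It is not the app's code. The factor names, the sample numbers, the helper names (`correlation_matrix`, `first_component`, `to_formula`) and the output formula syntax are all assumptions made for illustration. The component is found with plain power iteration so the snippet needs only the standard library; NumPy or scikit-learn would be the more usual choice.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of equally long data columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([x - mean for x in col])
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]

def first_component(matrix, iterations=500):
    """Leading unit-length eigenvector and its eigenvalue via power iteration."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k  # uniform start; fine for this sample data
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue that belongs to v.
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return v, eigenvalue

def to_formula(names, loadings):
    """Illustrative output only; the app's real formula syntax may differ."""
    return " ".join(f"{w:+.3f}*{name}" for name, w in zip(names, loadings))

# Hypothetical factor names and made-up values (one column per factor).
names = ["Value", "Quality", "Momentum"]
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.2, 1.9, 3.2, 3.8, 5.1, 6.3],
    [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
]

loadings, eigenvalue = first_component(correlation_matrix(data))
formula = to_formula(names, loadings)
print(formula)
# For a correlation matrix, eigenvalue / number of factors is the explained share.
print(f"explained variance share: {eigenvalue / len(names):.1%}")
```

Working on the correlation matrix is equivalent to running PCA on z-scored columns, which is why the data should be normalized before it goes in.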

Key Features:

  • Automatic Weighting: Converts PCA eigenvectors into weighted P123 formulas.

  • Mapping Support: Import your original P123 ranking system XML or a CSV mapping to replace factor names with their actual formulas.

  • XML Export for P123: Generates ready-to-import XML code for a new ranking system, with weights scaled proportionally to explained variance (summing to 100%).

  • Visualization: Includes Scree plots (explained variance) and Heatmaps to see which factors are clustering together.

  • Constraint Management: Automatically handles P123 character limits and scales weights to sum to 100%.
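The weight scaling mentioned in the feature list can be sketched as below. This is only one way to do it, assuming the weights are rescaled over the retained components only; the function name `variance_to_weights` and the sample eigenvalues are hypothetical.

```python
def variance_to_weights(explained_variance, decimals=2):
    """Scale explained variances to percentage weights that sum to 100.

    Assumption: only the retained components are passed in, so the
    weights are relative to the variance those components explain.
    """
    total = sum(explained_variance)
    weights = [round(100 * v / total, decimals) for v in explained_variance]
    # Rounding can leave the total slightly off; put the drift on the largest weight.
    drift = round(100 - sum(weights), decimals)
    largest = weights.index(max(weights))
    weights[largest] = round(weights[largest] + drift, decimals)
    return weights

# Made-up eigenvalues for five retained components.
eigenvalues = [4.2, 2.1, 1.3, 0.9, 0.5]
weights = variance_to_weights(eigenvalues)
print(weights)
```

Putting the rounding drift on the largest weight keeps the relative ordering of the components intact.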

Quick Demo: I put together a short video showing the workflow—from performing the analysis to importing the XML into a Ranking System: https://x.com/AlgoManX/status/2008892694194765968

How to use it in short:

  1. Download the EXE file.

  2. Upload a CSV from Factor Download.

  3. Run the analysis.

  4. Copy the generated formulas to a new P123 (classic) Ranking System.

I haven’t tested it extensively yet, but initial runs look promising. It works best with 10-30 factors—too many can exceed P123's character limits unless you tighten thresholds or use variance targeting to prune minor loadings.
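To make the pruning idea concrete, here is a rough sketch of two ways to drop minor loadings so a formula gets shorter. It is an interpretation, not the app's logic: "variance targeting" is read here as keeping the largest loadings until their squared sum reaches a target share of the component. The factor names, loadings, default cut-offs and formula syntax are all hypothetical.

```python
import math

def prune_by_threshold(loadings, min_abs=0.15):
    """Drop loadings whose absolute value is below min_abs."""
    return {name: w for name, w in loadings.items() if abs(w) >= min_abs}

def prune_by_variance(loadings, target=0.90):
    """Keep the largest loadings until their squared sum reaches
    `target` of the component's total squared loading."""
    total = sum(w * w for w in loadings.values())
    kept, running = {}, 0.0
    for name, w in sorted(loadings.items(), key=lambda kv: -abs(kv[1])):
        if running >= target * total:
            break
        kept[name] = w
        running += w * w
    return kept

def renormalize(loadings):
    """Rescale the surviving loadings back to unit length."""
    norm = math.sqrt(sum(w * w for w in loadings.values()))
    return {name: w / norm for name, w in loadings.items()}

def to_formula(loadings):
    """Illustrative formula text; the app's real output may differ."""
    return " ".join(f"{w:+.3f}*{name}" for name, w in loadings.items())

# Made-up loadings for one component.
component = {
    "EarnYield": 0.62, "SalesGr": 0.55, "ROE": 0.45,
    "Mom6M": -0.30, "Beta": 0.12, "Turnover": -0.08,
}

by_threshold = renormalize(prune_by_threshold(component))
by_variance = renormalize(prune_by_variance(component))
print(len(to_formula(component)), "characters before pruning")
print(len(to_formula(by_threshold)), "characters after pruning")
```

Either function can be tightened (higher `min_abs`, lower `target`) until the formula fits under whatever character limit applies.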

Before asking questions, check out the built-in Help section (bottom of the app window) for detailed guidance, best practices, and troubleshooting.

Please use this thread to share your results, findings, or suggestions when using the app. I'd love to hear how PCA-based rankings perform in your backtests!


Nice, @AlgoMan! Just today I was fighting with something similar to this... my programming level is still far behind yours ;). I will take a look at it for sure!

Some questions, @AlgoMan:

  1. Should the dataset we upload be normalized? And should it use the same winsorize % as the ZScore setting?

  2. The features in the CSV should be the same ones as in the ranking system loaded through "Load P123 Script", right?

  3. I also see there is another option, CSV mapping. I guess that is for future development (it is already in the Help section).

  1. Yes, normalized by date, and the trim/winsorize setting should be the same in the dataset download and in the app.

  2. When you create the dataset to download, if you choose to import factors from a ranking system, you then go back to that same ranking system and copy its XML code. This is used to map feature names to formulas.

To get the XML code to paste into “Load P123 script”, go back to the ranking system you imported the data from and click text editor.

If you imported the features from an AI Factor instead, go back to the AI ranking system and download the CSV file to map the features.

So there are two ways to do it.
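On the first point, "normalized by date" means each factor is standardized within each date's cross-section rather than over the whole dataset. The sketch below shows one common way to do that, winsorizing first and then z-scoring. It is an assumption about the general technique, not a description of how P123 or the app implements it; the tickers, dates, values and the `pct` default are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev

def winsorize(values, pct=0.10):
    """Clamp values to the pct and 1 - pct order statistics."""
    ordered = sorted(values)
    k = int(len(ordered) * pct)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]

def zscore_by_date(rows, pct=0.10):
    """rows: (date, ticker, value) tuples -> {(date, ticker): z-score}."""
    by_date = defaultdict(list)
    for date, ticker, value in rows:
        by_date[date].append((ticker, value))
    scores = {}
    for date, items in by_date.items():
        clipped = winsorize([value for _, value in items], pct)
        mu, sigma = mean(clipped), pstdev(clipped)
        for (ticker, _), value in zip(items, clipped):
            # A constant cross-section has no spread; score it as 0.
            scores[(date, ticker)] = (value - mu) / sigma if sigma else 0.0
    return scores

# Hypothetical data: ten tickers on two dates, each date with one outlier.
first = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
second = [10, 12, 11, 13, 15, 14, 16, 18, 17, -50]
rows = [("2024-01-06", f"T{i}", float(v)) for i, v in enumerate(first)]
rows += [("2024-01-13", f"T{i}", float(v)) for i, v in enumerate(second)]

scores = zscore_by_date(rows)
print(scores[("2024-01-06", "T9")])  # the outlier is clamped before scoring
```

The reason the settings should match is that PCA loadings depend on how outliers were treated: a dataset winsorized at one level and formulas evaluated at another will not describe the same components.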


I have tried it with a 13-factor system.

I have now run an AI Factor analysis with the 13 factors/features plus the 5 new PC components. As I expected, the first components get a very large importance and the lower “noise” components get a low importance.


Brilliant work! This is a really clean, first-principles application of linear algebra — the kind of material that shows up in serious physics or engineering courses.

Edit after the last post: does that look a little like a scree plot? Do you use a scree plot or maybe parallel analysis?


I included the scree plot visualization.
