I’ve been using Imported Stock Factors for a handful of data points - is it possible to set up an interface to import data for multiple factors at once?
It’s possible. How many are we talking about? How do you plan to use it exactly?
The current implementation is “version 1.0”. It does not scale well and has some gotchas that you need to be aware of.
A v2.0 is at the top of our to-do list because it will be very useful for users of our AI/ML product. The ability to bulk-upload your own data to leverage the AI/ML platform we are launching will be very important.
I’m just using five imported data points as part of a ranking model, so every couple of days I’ve been uploading five different files individually to update them. Doing them separately isn’t a problem at all, but loading a single spreadsheet with all of them together would be helpful.
We have an API to upload stock factors, but it requires a bit of programming. It can easily be done in a Jupyter notebook, for example. We created an API wrapper that makes using the API very easy. It’s described in this KB article: The API Wrapper - p123api - Help Center
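In the meantime, the single-spreadsheet workflow can be approximated with a short script: keep all five factors as columns in one file, then split it into one file per factor before uploading each one the usual way. This is a minimal sketch, not the p123api itself; the `date`/`ticker`/`value` column names and the wide input layout are illustrative assumptions about the import format.

```python
import pandas as pd

def split_factor_columns(df, id_cols=("date", "ticker")):
    """Split a wide factor spreadsheet (one column per factor) into
    one small DataFrame per factor, ready to save and upload separately.
    Column names here are illustrative, not the actual p123 import schema."""
    id_cols = list(id_cols)
    factor_cols = [c for c in df.columns if c not in id_cols]
    # Each per-factor frame keeps the id columns plus that factor's values.
    return {
        name: df[id_cols + [name]].rename(columns={name: "value"})
        for name in factor_cols
    }

# Example: one combined sheet holding two hypothetical factors.
combined = pd.DataFrame({
    "date":    ["2024-01-05", "2024-01-05"],
    "ticker":  ["AAPL", "MSFT"],
    "MyScore": [1.2, 0.8],
    "MyRank":  [10, 20],
})

per_factor = split_factor_columns(combined)
for name, frame in per_factor.items():
    frame.to_csv(f"{name}.csv", index=False)  # one upload file per factor
```

Each resulting CSV can then be uploaded through the existing single-factor import, so the user maintains one spreadsheet instead of five.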
Another possibility is to have a programmer add it to our DataMiner, our open-source, “no code” data-mining tool that you install on your computer. The source is here: portfolio-123 (Portfolio123) · GitHub
Ok thank you. I’m not sure what most of that means but I appreciate the quick response. I’ll keep loading the files separately for now.
Marco, that sounds very interesting. Can you expand on the kind of bulk uploaded data you anticipate being useful for the ML users?
Just more use-cases for our product.
Buying datasets is expensive for us since we need a redistribution license, so we want to make it very easy to use your own data. But users who purchase data themselves might find our tools too limiting (essentially ranking & screening). Having another way to process the data, together with everything else we offer, should start making P123 a cost-effective solution for different users. But factor import needs to be revised to be more efficient and to handle tens of millions of data points (for example, 5K stocks x 10y x 52w x 20 factors = 52M data points). Currently it strains to handle a few million.
Plus we think we are making ML quite easy to use, so we want to leverage all the work we put into it as much as possible (like macro and tick data for futures?)