NEW: Data Miner App & P123 API -- v1.0 (beta)

I think they were cutting deals around 10K-12K/y last year. You can ask Joseph, joseph.smith at spglobal dot com

Factset will likely be somewhat cheaper, but unlikely under 8K-10K/y. You can ask Caitlin, cdiehl at factset dot com

Just say you want to download data through P123, and that P123 gave you the contact to get the license.

I’m not exactly sure what this thing does. Does it allow you to output the rankings of all stocks in your universe over time?

philjoe, try running the samples. They are in the ‘samples’ folder: https://www.dropbox.com/sh/08lu93jqio254m2/AAB8f3zelalF3DOE0yEJfvaCa?dl=0

Also take a look at the Data Miner Intro document.

ok thanks

Does P123 use python for everything in the background?

Does this cost extra, or do we just keep creating new API keys when we’ve used up the 500 limit?

What 500 limit are you referring to?

2020-05-18 09:24:32,394: API request failed: ERROR: You are limited to 250 requests per hour. Please wait until 11:36 AM (12 minutes from now) before making additional requests.


Let’s say I made a file with 1,000 requests. Is there a way you can program the Data Miner to do 250, wait an hour, continue with the next 250 an hour later, and so on, without needing me to manually split it into four files of 250 each? Can the file be throttled so it doesn’t defeat the purpose by creating extra work? Thanks
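Until something like that is built into the Data Miner, one workaround is to throttle on the client side from Python. This is only a sketch under my own assumptions: the api_id/api_key constructor arguments are my guess at the p123api package's setup, rank_ranks is the call used further down this thread, and the 250-per-hour figure comes from the error message quoted above.

import time

from p123api import Client

# Hypothetical credentials; the constructor argument names are an assumption.
client = Client(api_id='YOUR_API_ID', api_key='YOUR_API_KEY')

REQUESTS_PER_HOUR = 250  # hourly cap quoted in the error message above

def run_throttled(request_list):
    # Run every request, pausing for an hour after each batch of 250
    # so the hourly limit is never hit.
    results = []
    for i, params in enumerate(request_list):
        if i > 0 and i % REQUESTS_PER_HOUR == 0:
            time.sleep(60 * 60)  # wait out the hourly window before continuing
        results.append(client.rank_ranks(params))
    return results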

When using the python library how do you specify the “Columns” parameter for the Ranks function?

For example, I tried the following

rankdetails = {
  "engine": "Current",
  "vendor": "Compustat",
  "pitMethod": "Prelim",
  "rankingSystem": "Core: Value",
  "universe": "Prussell 1000",
  "asOfDt": "2020-04-20",
  "includeNames": False,
  "includeNodeDetails": True,
  "columns": "factor"
}

tmp = client.rank_ranks(rankdetails)

but get the following error back

ClientException: API request failed: Unrecognized parameter <columns>

If I don’t include the “columns” parameter in the dictionary, then everything works but I just get the aggregate (or top-level) rank and none of the underlying factor ranks.

Thanks,

Daniel

I was running a ranks-over-periods download and it maxed out, giving me an error at 500 periods.

P123,

I do think this is very nice, has great potential and is a clear improvement in the volume of data that can be downloaded.

But why is it that we don’t just expand on what we can do now with Excel downloads?

As it is, I am going to use this for the sole purpose of getting an expanded download of something that (it seems) was never limited by the data vendor into an Excel spreadsheet on my office computer, so I can move it to my MacBook to use in Spyder, Jupyter notebooks, or Google’s Colab.

Again, good that I can download more data now. But why not just expand the Excel downloads?

I am sure there is a good reason that I am not aware of, so I will not press the suggestion, I guess.

But it does seem (to this non-programmer) that P123 has to generate the data and download it either way. Are you doing (and creating for members) a lot of work that may not be necessary for even the most advanced users?

I do get that Excel limits the number of rows to just over a million (1,048,576 rows by 16,384 columns), and that means maybe downloading two spreadsheets for the most demanding data requirements. I will be using fewer than 16,384 factors (columns) most of the time ;-) The row limit should never require more than two spreadsheets with any reasonable universe.
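Spelled out, that arithmetic is roughly as follows (hypothetical numbers, just to illustrate the scale):

# Hypothetical sizing: a 1,000-stock universe ranked weekly for 20 years.
stocks = 1_000
weeks = 20 * 52                    # 1,040 weekly snapshots
rows = stocks * weeks              # 1,040,000 rows of rank data
excel_row_limit = 1_048_576        # Excel's per-sheet row limit
sheets_needed = -(-rows // excel_row_limit)   # ceiling division
print(rows, sheets_needed)         # 1040000 1 -> fits in a single sheet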

Best,

Jim

I created a new API key but now I am getting:

2020-05-18 13:58:28,913: API request failed: request quota exhausted

Still getting “request quota exhausted” this morning.

philjoe, please try it now. We changed the API monthly limit from 500 to 5K/mo for now, until we figure out what to do with the API. We’re also keeping it limited to make sure we can handle the load. The idea is to have it included in P123 memberships for light use, and to have a few options for power users.

You just need the “includeNodeDetails” parameter set to true; there is no “columns” parameter supported (https://api.portfolio123.com:8443/docs/index.html#/Rank/_ranks).
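So the earlier snippet should work once the unsupported parameter is dropped, roughly like this (same settings as above, minus “columns”):

rankdetails = {
  "engine": "Current",
  "vendor": "Compustat",
  "pitMethod": "Prelim",
  "rankingSystem": "Core: Value",
  "universe": "Prussell 1000",
  "asOfDt": "2020-04-20",
  "includeNames": False,
  "includeNodeDetails": True  # this flag is what returns the per-node factor ranks
}

tmp = client.rank_ranks(rankdetails)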

OK thanks, but I think even 5k/month is too small to use it properly. Would 5k/day be too much to handle?

@Marco, please advise:

Aggregate(“EarnYield”,#Industry,#Avg,16.5,#Exclude,False,True)

What would be the right syntax in the Data Miner for this formula? If I understood the readme, all I have to do is put quotes around it to pass the whole formula as a string, so:

“Aggregate(“EarnYield”,#Industry,#Avg,16.5,#Exclude,False,True)”

Is that it? Thanks

This would be the correct value (please note that you have to escape inner double quotes):
"Aggregate(\"EarnYield\",#Industry,#Avg,16.5,#Exclude,False,True)"
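If you are building that string in Python rather than typing it straight into a Data Miner file, the same escaping looks like this (just an illustration of the quoting, not a specific Data Miner field):

# Inner double quotes are escaped with backslashes so they survive inside the outer quotes.
formula = "Aggregate(\"EarnYield\",#Industry,#Avg,16.5,#Exclude,False,True)"
print(formula)  # Aggregate("EarnYield",#Industry,#Avg,16.5,#Exclude,False,True)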

I had been using Python with Selenium to extract Rank data on the Buy and Sell dates for positions in some of my Simulations. Yesterday I transitioned to the new API, which seems much simpler, quicker, and cleaner to use, but I ran up against the Request Quota today. What is the current Request Quota limit? Do I just need to revert to my old method?