All,
TL;DR: Just one csv file downloads would change everything for P123 forever. It starts and ends with the cvs file. Multiple ways to use it could and should be made available. It does not have to be and should not be made harder than that.
I recently requested that ranks be downloadable in the screener with more than 500 ranks available.
Dan informed me that I can get that in the DataMiner without any programming skills and I very much appreciate that.
I have mentioned before that I would like to in the future (and have done in the past) XGBoost with data from P123. I think the API was actually started when I informed P123 that the fundamentals are not necessary for XGBoost, Random Forests or neural nets.
It is debatable whether fundamentals would help regression. Maybe for some but I think one might try a regression with the ranks as the independent variable and get better answers than if you use the fundamentals.
Or you could just use the z-score instead of fundamentals and get the same answer. The potential for P123 is Ginormous, HUGE, AWESOME, Unlimited!!! There is no limit to the machine learning abilities at P123. I do not know what other adjectives to use.
I do not care if someone prefers spreadsheets. Please use spreadsheets, an abacus of whatever you like. You do you. Whatever floats your boat. I am not causing anyone any harm if I ask for some data in a formant the facilities machine learning.
TL;DR: P123 has absolutely incredible potential for machine learning. No fundamental data required. I thought that was already understood.
Dan was very kind and responded to my post. When I clarified that I only need ranks he responded (paraphrasing) that I should check out the DataMiner—which I am doing now.
ANY MACHINE LEARNING PROGRAM JUST NEEDS COLUMN HEADS OF FACTORS OR FEATURES (e.g., price to free cash flow) WITH THE RANKS IN THE COLUMN AND THE TARGET (OFTEN THE RETURNS FOR THE NEXT WEEK OR THE NEXT MONTH). And an index I guess which might as well be ticker and date.
I get now that P123 is never going to answer any questions about collinearity, how Ridge Regression could help with that or whether I want to use out-of-bag validation for a random forest, monotonic constraints for XGBoost or subsampling of the columns and that I will have to set up my own methods for early stopping. I get that now.
So again, I will do it myself one way or the other.
Please check if I am alone in this but it would be nice to have a dowload the has a column of the features (rank or z-score) and the next week or month’s returns.
All lined up without having merge returns (when I do should I do an external merge or internal merge and wha t if I want to sort more than one thing) and ranks with a sort.
If this is available would you please point me to it. If I am not alone in thinking that would be nice would you consider doing it. But I think that would be useful for ANYONE doing machine learning AND anyone wanting to use SPSS or JASP for that matter.
Newbies will use this. It should be FRONT AND CENTER ON THE HOME PAGE if you want to attract undergrads looking to manipulate data with Python, R, SPSS, JASP or whatever their professors like.
If P123 wants to buy new servers, hire someone who has some experience doing machine learning and take years to develop this that is fine. Keep this all mysterious and out of control of the P123 members with little ability to get requests answered (except when Dan responds) I just have one question: is that the best business model.
People, for now, will get immediate answers from ChatGPT as well as some help with any code.
Please point me to that DataMiner download or consider providing it if it is not available. I do not think you will regret it from a business perspective (irregardless of what I may want).
Ii will run it myself in Colab paying a fee for faster runs if necessary. I would prefer to avoid sorting separate, individual rank download and return downloads merging them (external merge or internal merge?) and hoping it all lines up with no errors.
And again thank you Dan for responding to a forum reques. The answer does not always have to be yes and you did respond. You understood when I clarified about ranks and responded again. Interacted
If P123 were to help newbies for now…how do I make the date the indiex?…How do I double sort ranke and date?….internatl merge or external merge?…Which sort is best?
ChatGTP is better for any manchine learning for now. P123 could at least help people who are not expert programs do some of the munging and data wrangling.!
So with no irony. Multiple downloads, Switch the indexes to date AND ticker. External merge. Then double index. Make sure the set how you handle NAs. Everyone can do that right? Plus it is fun.
Just one csv file download would change everything forever for P123 I think (in a good way). More effective and I would guess less expensive than anything done with the AI/ML to date.
Jim