DataMiner can be used to get the data you are looking for. Below is an example that shows two options you could use for the target and two example factors.
Main:
    Operation: DataUniverse
    On Error: Stop
    Precision: 3
Default Settings:
    PIT Method: Prelim
    Start Date: 2020-01-01
    End Date: 2020-03-01
    Frequency: 1Week # ( [ 1Week ] | 2Weeks | 3Weeks | 4Weeks | 6Weeks | 8Weeks | 13Weeks | 26Weeks | 52Weeks )
    Universe: DJIA
    Include Names: false
    Formulas:
        # Target, i.e. future excess returns
        - FutRelRet_SPY: FutureRel%Chg(20,GetSeries("SPY")) # 4 week future total return relative to the SPY ETF
        - FutRelRet_Ind: FutureRel%Chg(20,#Industry) # 4 week future total return relative to its industry
        # Factors. Up to 100 formulas.
        - FRank("EarnYield",#ALL,#DESC)
        - ZScore("Pr2SalesQ",#All)
The output will be saved as a CSV file.
Let me know if you have any questions regarding DataMiner.
Jim - thanks for the compliments in your posts, but you are giving me way too much credit. My role here at P123 is mainly as a tester. Marius is the brains behind the API. Also, thank you for your time helping to guide other users. I learn a lot from your posts.
Thank you very much for your efforts, Dan. Your example with the future excess return target saves me a lot of time, as I am just starting to learn about DataMiner. So, thanks in advance for your kind offer to answer all my future questions about DataMiner.
You seem to have advanced knowledge of neural nets, so I would guess you may be interested in trying one at some point.
You could possibly use some of this Python/TensorFlow code as a starting point. I have left out all the munging of the data and the library loading. You will want to normalize and/or standardize the data. Just divide ranks by 100 to start with. TensorFlow does run with ranks up to 100, but more slowly and maybe with worse results. You will change the input shape and dim according to the number of features you use, of course.
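As a minimal sketch of the scaling step mentioned here, assuming ranks arrive on a 0 to 100 scale: ranks can be divided by 100, and other factors can be standardized to zero mean and unit variance. The function names `scale_ranks` and `standardize` are illustrative only and do not come from the original code.

```python
from statistics import mean, pstdev


def scale_ranks(ranks):
    """Map percentile ranks from the 0-100 range onto 0-1."""
    return [r / 100.0 for r in ranks]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; return zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


ranks = [12.5, 50.0, 87.5, 100.0]      # made-up example ranks
print(scale_ranks(ranks))               # values now lie between 0 and 1
print(standardize([1.0, 2.0, 3.0]))     # made-up raw factor values
```

In practice the mean and standard deviation would probably be computed on the training data only and then reused for the validation and test data.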
I leave it to you to decide whether you want to keep BatchNormalization() and Dropout(), and what level of Dropout to use if you keep it, etc.
“logcosh” is a wonderful loss function, but is it scale invariant? You will probably want to start with Nadam as an optimizer and perhaps end up keeping it. ReLU works well for activation. I suspect you will find only marginal improvement with anything else.
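On the scale question: log-cosh is approximately e²/2 for small errors and |e| − ln 2 for large ones, so rescaling the target moves the loss between a squared-error-like regime and an absolute-error-like regime. The short check below, written in plain Python rather than TensorFlow, suggests that it is not scale invariant. Note that `log_cosh` here is a hand-written helper for illustration, not the Keras implementation.

```python
import math


def log_cosh(error):
    """Numerically stable log(cosh(error))."""
    a = abs(error)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


# Compare the loss with its small-error and large-error approximations.
for error in (0.01, 1.0, 10.0):
    print(error, log_cosh(error), 0.5 * error**2, abs(error) - math.log(2.0))

# If the loss were scale invariant, multiplying every error by 100 would
# multiply the loss by the same factor regardless of the error size.
ratio_small = log_cosh(100 * 0.01) / log_cosh(0.01)
ratio_large = log_cosh(100 * 1.0) / log_cosh(1.0)
print(ratio_small, ratio_large)  # the two ratios differ by a wide margin
```

This is one reason to think about the units of the target (for example, returns in percent versus decimals) before training.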
My training is to start with a lot of layers and nodes and then “shrink” it down (as you can see). Of course, how to shrink it down (e.g., reducing the number of layers or using regularization/Dropout) is a complex question. I leave the art of neural nets to you, and there is no reason to think the code is anything other than a possible starting point.
Thanks Jim, I really appreciate your help and commitment to moving the AI issue forward.
I already have some experience with TensorFlow and Keras, and it definitely makes sense to normalize the ranks to values between 0 and 1 for better training.
Currently I am still thinking about what methodology might be best for the task. One possibility could be to try to predict the alpha with neural networks. Another possibility, perhaps a more promising one in my view, would be to use reinforcement learning. Such training attempts to find the optimal weights of the selected ranks that maximize the alpha over, say, the last 10 years, based on a reward function that has yet to be defined. I am still at a very early stage of my thinking, but if I gain any new insights, I will be happy to share them here.
I was just exchanging emails with Jim, and we would like to know whether, given your previous experience in machine learning, you might be interested in exploring machine learning techniques (LSTM, TensorFlow, XGBoost) and reinforcement learning techniques to trade cryptocurrencies on Binance/Coinbase (I mean Bitcoin, Dogecoin, Litecoin, Ripple, etc., not GBTC/ETHE).
Kindly let us know your thoughts.
Daily historical data of various cryptocurrencies can be downloaded from investing.com.
James, I think you are interested in trading strategies that cannot be easily backtested, and certainly not backtested effectively with P123. This is a situation particularly suited to reinforcement learning, I believe. And you recently became interested in how reinforcement learning could be used for this, in part because of this thread.
Reinforcement learning using “upper confidence bound” is an obvious method when considering multiple trading strategies, I think.
This is a pretty good summary on the web: The Upper Confidence Bound (UCB) Bandit Algorithm. The amount of “charge” used as an example in the story can be replaced with “return for a trade strategy” in practice.
The equation can easily be entered into an Excel spreadsheet or programmed in Python.
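For the Python route, here is a minimal sketch of a UCB-style selection rule, assuming the common form "average return so far plus c × sqrt(ln t / n)", where t is the number of trades made so far and n is the number of times a strategy has been used. The strategy returns are simulated with made-up Gaussian numbers, and the names `ucb_select`, `run_ucb` and the value of `c` are illustrative assumptions, not taken from the linked article.

```python
import math
import random


def ucb_select(counts, means, t, c=2.0):
    """Return the index of the strategy with the highest upper confidence bound."""
    # Try every strategy once before trusting any estimate.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + c * math.sqrt(math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb(true_means, rounds=2000, c=2.0, seed=0):
    """Simulate choosing among strategies whose returns are noisy."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # how often each strategy was chosen
    means = [0.0] * k     # running average return per strategy
    for t in range(1, rounds + 1):
        i = ucb_select(counts, means, t, c)
        reward = rng.gauss(true_means[i], 1.0)  # simulated return, not real data
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]  # incremental average
    return counts, means


counts, means = run_ucb([0.0, 0.5, 2.0])  # hypothetical mean returns
print(counts)  # the strategy with the highest mean should dominate
print(means)
```

The bonus term shrinks as a strategy is used more often, so rarely tried strategies keep getting an occasional trade until their estimates are reasonably certain.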
Reinforcement learning is certainly an interesting topic and could be used for any situation where it is hard to run a sim on strategies that are not funded, that is, when you have to fund a strategy to really know how well it works.
Just some background on what we have discussed so far on the topic of reinforcement learning for trading strategies. Not that the upper confidence bound algorithm fully addresses James’ specific needs.