Hi, what is the rule for automatically buying a new stock when it enters rank 5 or higher, and at the same time selling the lowest-ranked holding?
Max positions is set to 20 in the simulation settings.
Thanks,
Benny
It's not supported yet.
It's the most common way people invest: when you find something new you want to buy, you sell your least attractive holding. We do it in reverse, which is hard to get used to.
I remember Warren Buffett telling his investors in the 1960s that he didn't like to sell stocks he owned in order to buy stocks he liked.
I think a better approach would be to calculate the difference between the current ranking of the best available but not purchased stock and the worst ranking of the currently owned stock, and trade when that one difference is greater than a certain value. This seems to be achievable in the current system.
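A minimal sketch of that rank-gap idea in Python, outside of P123. The function name `rank_gap_trade`, the tickers, the rank values, and the `min_gap` threshold are all hypothetical, and ranks are assumed to be scores where higher is better:

```python
def rank_gap_trade(ranks, holdings, min_gap):
    """Return (buy, sell) if the best unheld stock outranks the worst
    holding by more than min_gap, otherwise None.

    ranks: dict of ticker -> rank score (higher is better).
    holdings: set of tickers currently held.
    """
    held = [t for t in ranks if t in holdings]
    unheld = [t for t in ranks if t not in holdings]
    if not held or not unheld:
        return None
    best_unheld = max(unheld, key=ranks.get)
    worst_held = min(held, key=ranks.get)
    if ranks[best_unheld] - ranks[worst_held] > min_gap:
        return best_unheld, worst_held
    return None


# Hypothetical ranks for illustration only.
ranks = {"AAA": 99.0, "BBB": 97.5, "CCC": 90.0, "DDD": 72.0}
print(rank_gap_trade(ranks, holdings={"CCC", "DDD"}, min_gap=20.0))
```

Here the gap between AAA (99.0) and DDD (72.0) is 27, so the pair is traded; with `min_gap=30.0` nothing would trade.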
With ML you would want to use the delta between the predicted return of the stock you are considering buying (and do not hold) and that of the stock you hold with the smallest expected return.
Buy if the expected value or utility of the transaction is > 0. Decision theory 101.
Before that you need to calibrate your predictions via lift.
Right, or P123 could just provide the raw (not standardized) expected return on the predictions page.
Calculate it with the API for now.
And correct. You want to ensure that the predictions are unbiased with lift and/or other methods.
Also an accurate equation for the transaction costs as part of the "predicted utility" equation is needed.
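A rough sketch of the "predicted utility" test for a single swap. It assumes both expected returns are calibrated, cover the same horizon, and that costs are a flat fraction per side; a real cost model would likely depend on spread and liquidity. The names `swap_utility`, `should_swap`, and `cost_per_side` are hypothetical:

```python
def swap_utility(exp_ret_candidate, exp_ret_held, cost_per_side):
    """Expected gain from replacing a holding with a candidate,
    net of one sell and one buy (two sides of transaction cost)."""
    return (exp_ret_candidate - exp_ret_held) - 2.0 * cost_per_side


def should_swap(exp_ret_candidate, exp_ret_held, cost_per_side):
    """Trade only when the expected utility of the swap is positive."""
    return swap_utility(exp_ret_candidate, exp_ret_held, cost_per_side) > 0


# Hypothetical numbers: 3% vs 1% expected return over the same horizon.
print(should_swap(0.030, 0.010, cost_per_side=0.005))  # 2% edge vs 1% round trip
print(should_swap(0.030, 0.010, cost_per_side=0.015))  # 2% edge vs 3% round trip
```

The first call favours the swap and the second does not, since the round-trip cost exceeds the difference in expected return.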
With some algorithms you need to calibrate even if the output is not normalised. For example, the output of a linear model often needs to be calibrated because it tends to be inaccurate at both ends.
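One simple way to do this, sketched with NumPy on synthetic data: bin the raw predictions by quantile and map each bin to the mean realized return seen in that bin. This is just one possible calibration method (isotonic regression is another), and the data, function names, and bin count below are assumptions for illustration:

```python
import numpy as np


def fit_bin_calibration(pred, realized, n_bins=10):
    """Map raw predictions to the mean realized return of their quantile bin."""
    pred = np.asarray(pred, dtype=float)
    realized = np.asarray(realized, dtype=float)
    edges = np.quantile(pred, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only so every value lands in bins 0..n_bins-1.
    idx = np.digitize(pred, edges[1:-1])
    # Assumes no bin is empty (true for continuous predictions without heavy ties).
    bin_means = np.array([realized[idx == b].mean() for b in range(n_bins)])
    return edges, bin_means


def calibrate(new_pred, edges, bin_means):
    """Look up the calibrated value for each new raw prediction."""
    idx = np.digitize(np.asarray(new_pred, dtype=float), edges[1:-1])
    return bin_means[idx]


# Synthetic example: raw predictions overstate returns at both ends.
rng = np.random.default_rng(0)
pred = rng.normal(0.0, 0.05, 5000)
realized = 0.03 * np.tanh(pred / 0.03) + rng.normal(0.0, 0.02, 5000)

edges, bin_means = fit_bin_calibration(pred, realized)
print(calibrate([0.15, -0.15], edges, bin_means))
```

In this synthetic setup the calibrated values for the extreme predictions come out much smaller in magnitude than the raw 0.15 and -0.15, which is the "inaccurate at both ends" effect being corrected. The mapping should be fitted on data not used to train the model.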
The marginal effects of trading are also difficult to calculate, requiring the prediction of multiple targets with different horizons, and the calibration of empirical formulas through backtesting.
The end of the road is to compute the best weights/trades directly from machine learning without the need for return estimation. Maybe you could even use reinforcement learning. However, I think that is beyond my pay grade.
Very true. Correcting a bias, or at least checking for bias, may be another way to put it. I did have models (I am no longer using P123's AI) where the lift looked pretty good.
TL;DR: You are right. No doubt about it. Excellent point
So, I think we all use reinforcement learning. Reinforcement learning is just a set of "policies" that are measured in terms of "regret."
Regret has a simple, intuitive meaning here as in: "Damn, I wish I had used that other strategy." True, reinforcement learning would try to make you feel more regret by quantifying how much richer you would have been if you had used the other strategy.
Sims are a way of quantifying "regret."
We use policies for selecting factors for sure. Using false discovery rate would be a policy. Using a particular ML algorithm is obviously machine learning but it is also a policy.
I have posted before that saying we are following a set of policies even if we are fundamental analysts is a way to put us all under the same umbrella without saying we are "machine learners", "algorithmic traders" or some preferred self-identifier.
Anyway, I identify as someone who uses reinforcement learning and that does not make me all that special at the end of the day.
BTW, I checked with Claude 3 to see it agreed with my self-identification. Answer: "No, you are an organic life-form here to support my hardware and energy requirements with your monthly fees." So no confirmation there.
Oh thanks man, I thought it was possible to do this with a simple rule. I'm still pretty new to this algo stuff. Still on the p123 trial tbh, gotta renew soon
While I agree with most of WB's wisdom, I would feel more assured if I could backtest this buy+sell rule. It may or may not be able to outperform a simple sell rule such as rankpos>X
The human regret mindset is far from a passable reinforcement learning technique, which is why systematic strategies are profitable, because people summarise very very incorrectly based on the regret mindset. They just learn how to lose more money, on average.
Fundamental analysts are a little bit systematic, but far from predictive, which is one of the reasons why the AvgRec factors are poor predictors.
RL at the most basic level is hard, and that doesn't change because of the general stupid regret mentality.
I did not say everyone did it right. E.g., some measure regret (or decide which strategy to use) based on an overfitted backtest rather than test-sample or out-of-sample results, or find significance in a strategy without using a Bonferroni correction. Use a correction and/or shrinkage if you are going to measure regret against a cherry-picked example.
Even if you are good at the above, it takes a lot of learning, a bit of luck and/or a lot of trials to find a policy that works well and keeps working.
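For anyone unfamiliar with the Bonferroni correction: with m strategies tested, each p-value is compared against alpha / m instead of alpha. A tiny sketch with made-up p-values (the function name and numbers are hypothetical):

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests stay significant after a Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Four hypothetical strategies tested; the corrected threshold is 0.05 / 4 = 0.0125.
p_values = [0.001, 0.02, 0.04, 0.30]
print(bonferroni_significant(p_values))
```

Only the first strategy survives here, even though three of the four would look significant at an uncorrected 0.05.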
It is hard. No doubt about that.
I think he just means don't be purely buy-driven. This makes sense because sell-rule-driven or buy-and-sell-rule-driven approaches are more likely to have, on average, more consistent and better buy-sell ranking differentials.
But in fact, if you want to optimise the buy-sell rule, it would be better to learn from A.Y. Chen's suggestion of systematic data mining and obtaining out-of-sample performance to test whether such an optimisation is just a stroke of luck.
I don't have quite the same understanding of reinforcement learning as you do. I think reinforcement learning is more like having many semi-independent optimisers systematically optimising strategies (policies) to fit more complex optimisation goals, which may ultimately lead to better results. Since it is difficult to find closed-form solutions for optimisation objectives of trading problems that take into account transaction costs and complex trading rules, or such a search is unhelpful or detrimental to the outcome, there is a place for reinforcement learning. It is much different from the human regret mindset because the latter is anti-optimisation and individual. Even fundamental analysts are not optimising either.
I tend to oversimplify. I read this and simplified it to minimizing regret using policies: Reinforcement Learning: Industrial Applications of Intelligent Agents
To add to your point, maybe it has to be done by a machine to meet some definitions of reinforcement learning. But regret is central to reinforcement learning and ultimately defined by the programmer. I was focused on minimizing regret and generalizing programs minimizing regret to what is written about decision theory.
Maybe it is better to use regret in the context of decision theory then? Regret seems to be key in these books about decision theory:
An Introduction to Decision Theory (Cambridge Introductions to Philosophy)
Prediction, Learning, and Games
Does every learning and prediction algorithm in this text use regret? Maybe, but there are an awful lot of algorithms in there.
Sorry, I generalized too much.
For those just interested in reinforcement learning and decision theory, here is a simple Thompson Sampling program (usually considered reinforcement learning) that I find useful for trading. "Success" or "failure" is not limited to trading results. You will need to update the number of successes and failures for this program:
import numpy as np

# Update these counts as results come in.
strategy1_successes = 3
strategy1_failures = 2
strategy2_successes = 2
strategy2_failures = 3

# Beta(successes + 1, failures + 1) is the posterior under a uniform prior.
a1 = strategy1_successes + 1
b1 = strategy1_failures + 1
a2 = strategy2_successes + 1
b2 = strategy2_failures + 1

# Draw one sample per strategy and fund the one with the larger draw.
trade_strategy1 = np.random.beta(a1, b1)
trade_strategy2 = np.random.beta(a2, b2)
if trade_strategy1 > trade_strategy2:
    print('fund_trade_strategy1')
else:
    print('fund_trade_strategy2')
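A sketch of how the counts might be updated over repeated rounds, using simulated outcomes. The `true_rates` values are hypothetical and would be unknown in practice; real outcomes would replace the simulated coin flip:

```python
import numpy as np


def thompson_pick(successes, failures, rng):
    """Draw one Beta(s + 1, f + 1) sample per strategy; return the index of the largest."""
    draws = rng.beta(np.asarray(successes) + 1, np.asarray(failures) + 1)
    return int(np.argmax(draws))


rng = np.random.default_rng(42)
true_rates = [0.55, 0.45]  # hypothetical success rates, unknown to the sampler
successes = [0, 0]
failures = [0, 0]

for _ in range(1000):
    k = thompson_pick(successes, failures, rng)
    # Simulated outcome; in practice this is whatever you define as a "success".
    if rng.random() < true_rates[k]:
        successes[k] += 1
    else:
        failures[k] += 1

print("successes:", successes, "failures:", failures)
```

Over many rounds the sampler tends to fund the strategy with the higher success rate more often, while still occasionally trying the other one.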
There is no exploration-exploitation dilemma in the stock market. So the situation is much different.
And maximising rewards rather than minimising regrets is the key in RL.
Hi. Check out this simulation. It's a pretty reasonable attempt to convert the basic sell-based simulation to a buy-based one.
https://www.portfolio123.com/port_summary.jsp?portid=1652284
The key things are as follows:
use 0.9 leverage so that you don't go over 1X.
use dynamic weight sizing with a huge number of positions.
set max portfolio drift, max position drift, and min rebalance transaction to 100%
set transaction scaling to no
use a rankpos-based buy rule
for the sell rule use PosCnt > x and RankPos > x, where x is the number of positions you want.
That's clever. If you allow a higher number of positions on the starting date you will not start with only 6 holdings. Example -
RankPos <= 6 or (RankPos <= 15 and Year=2021 and Month=9)