Does finding Optimal-Kelly for a portfolio (regardless of any leverage considerations) also give you the best Sharpe Ratio?

All,

ChatGPT 4 gets you there, I believe. It did make an error, which I have flagged in brackets [] below, but I think the conclusion is still correct. Here is what I asked it:

Q: [After some discussion of continuous Kelly and expected returns that I leave out for brevity]: Is it true from the formulas that optimal Kelly maximizes the Sharpe ratio? I understand that would require a mathematical derivation possibly including some calculus perhaps. Can you try to derive that?

ChatGPT 4’s answer: "Yes, the Kelly criterion aims to maximize the expected logarithmic growth rate, which can be related to the Sharpe ratio. While I won’t provide a full derivation, I’ll give you an outline of the relationship between the two concepts.

First, let’s consider a simplified version of the Kelly criterion formula for a single investment:

Fraction to bet (f) = (Expected return - Risk-free rate) / Variance

Now, let’s recall the definition of the Sharpe ratio:

Sharpe ratio = (Expected return - Risk-free rate) / Standard deviation

Notice that the numerator in both formulas is the same, and the denominators are related (Variance = Standard deviation^2).

Now, let’s consider the expected growth rate of a portfolio when following the Kelly criterion. The expected growth rate, g, can be expressed as:

g = (Mean return - Risk-free rate) * f - 0.5 * Variance * f^2

To maximize the expected growth rate, we can differentiate g with respect to f and set the result to zero:

dg/df = (Mean return - Risk-free rate) - Variance * f = 0

Solving for f, we find:

f = (Mean return - Risk-free rate) / Variance

This formula for the optimal Kelly fraction (f) is identical to the one we started with. However, we can also express it in terms of the Sharpe ratio:

f = Sharpe ratio * Standard deviation [My addition: This is an error; it should be Sharpe ratio / Standard deviation. The next sentence is true nonetheless.]

Thus, the optimal Kelly fraction is directly proportional to the Sharpe ratio [My edit: assuming a constant mean and standard deviation]. This relationship suggests that a strategy with a higher Sharpe ratio would generally lead to a larger optimal betting fraction according to the Kelly criterion, resulting in a higher expected growth rate. Therefore, the Kelly criterion can be understood as an attempt to maximize the Sharpe ratio, taking into account the risks and returns associated with a given investment strategy."

The weakness in the above, I think, is that the standard deviation and the mean are not constants. If you use calculus for this, I think the derivative (or partial derivative) is not that simple. It may require more careful differential calculus, or at a minimum a check of the calculus, and my calculus is pretty rusty. On reflection, I think ChatGPT may have gotten it wrong; at least, I am not completely convinced without checking the calculus.
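For anyone who wants to check the single-asset calculus without doing it by hand, here is a minimal numerical sketch. The values of `mu`, `r`, and `sigma` are made-up round numbers, not data from this thread, and the growth function adds the risk-free rate as a constant term (which does not change where the maximum is). It compares the closed-form fraction with a brute-force search over leverage and also checks the corrected identity f = Sharpe ratio / Standard deviation.

```python
import numpy as np

# Hypothetical annualized inputs (assumptions, not from the thread).
mu, r, sigma = 0.10, 0.03, 0.20

def growth(f):
    """Continuous-time growth rate at leverage f, including the risk-free rate."""
    return r + (mu - r) * f - 0.5 * sigma**2 * f**2

sharpe = (mu - r) / sigma
f_closed = (mu - r) / sigma**2      # from setting dg/df = 0
f_from_sharpe = sharpe / sigma      # corrected identity: Sharpe / std dev

# Brute-force check: scan a grid of leverage values and keep the best one.
grid = np.linspace(0.0, 5.0, 50001)
f_grid = grid[np.argmax(growth(grid))]

print("closed form:", f_closed)
print("Sharpe / sigma:", f_from_sharpe)
print("grid search:", f_grid)
```

This only confirms the derivative for one asset with a fixed mean and standard deviation. It says nothing about what happens when the mix of assets (and therefore the mean and standard deviation) changes.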

Jim

So this is actually trivial and does not require calculus, although one could no doubt do it with calculus.

Kelly attempts to maximize growth which is given by:

g = risk free rate + (Sharpe ratio ^2)/2

When you succeed in finding optimal Kelly, you maximize g. Because of this simple formula, it is obvious that you are maximizing the Sharpe ratio when you maximize g (g increases monotonically with the Sharpe ratio as long as the Sharpe ratio is positive).
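A small sketch of that formula, using the same kind of made-up inputs (`mu`, `r`, `sigma` are assumptions, not data from the thread). It checks that growth at the Kelly fraction equals risk free rate + (Sharpe ratio ^2)/2, and it prints growth and Sharpe ratio at a few other leverage levels for comparison.

```python
# Hypothetical annualized inputs (assumptions, not from the thread).
mu, r, sigma = 0.10, 0.03, 0.20

def levered_stats(f):
    """Growth rate and Sharpe ratio of a single asset held at leverage f > 0."""
    excess = f * (mu - r)          # excess return scales with leverage
    vol = abs(f) * sigma           # so does volatility
    g = r + excess - 0.5 * vol**2  # continuous-time growth approximation
    return g, excess / vol

sharpe = (mu - r) / sigma
f_star = (mu - r) / sigma**2       # Kelly fraction
g_star, s_star = levered_stats(f_star)

print("growth at Kelly:", g_star, "vs r + S^2/2:", r + sharpe**2 / 2)
for f in (0.5, 1.0, f_star, 3.0):
    g, s = levered_stats(f)
    print(f"f={f:.2f}  growth={g:.4f}  sharpe={s:.2f}")
```

In this sketch the Sharpe ratio comes out the same at every leverage level while growth peaks at the Kelly fraction. So, under these assumptions, the formula seems to describe growth once leverage is already set to Kelly, and the comparison that matters is between different assets or mixes rather than between leverage levels.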

This seems conclusive but troubling. In all those years when Markowitz and Thorp debated the value (or lack of value) of the Kelly criterion, this connection was missed?

Oh well, ChatGPT 4 finally got it. I don’t see how it could be wrong within the relatively simple assumptions behind the derivation of g in this formula (the usual Gaussian distribution, etc.).

This is not just an academic exercise for me. Portfolio Visualizer has some limitations in building a portfolio. I think there are better ways, several of them it now seems, that are not that hard mathematically. That would be for another post, I think.

Jim

I’m sure this has been answered before, but what is the P123 formula for the Kelly criterion?

Tony,

I am not sure that this answers your question, but for the continuous Kelly criterion it is (μ - r)/σ^2, the mean excess return divided by the variance (this reduces to μ/σ^2, the mean divided by the variance, when r = 0). The expected growth is (μ - r)^2/(2σ^2) + r.

Where r is the risk-free rate, μ is the mean and σ is the standard deviation.
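Here is a minimal sketch of how those two formulas could be computed from a series of periodic returns. It is not P123's formula or anything built into P123; the function name `continuous_kelly` and its parameters are made up for illustration, it assumes the excess-return form (μ - r)/σ^2, and the returns are simulated.

```python
import numpy as np

def continuous_kelly(returns, risk_free=0.0, periods_per_year=252):
    """Return (Kelly fraction, expected growth) from periodic simple returns.

    risk_free is an annualized rate; mean and variance are annualized by
    simple scaling, which is itself an approximation.
    """
    returns = np.asarray(returns, dtype=float)
    mu = returns.mean() * periods_per_year
    var = returns.var(ddof=1) * periods_per_year
    f = (mu - risk_free) / var                               # (mu - r) / sigma^2
    g = risk_free + (mu - risk_free) ** 2 / (2.0 * var)      # (mu - r)^2 / (2 sigma^2) + r
    return f, g

# Simulated daily returns (an assumption, not real market data).
rng = np.random.default_rng(0)
daily = rng.normal(loc=0.0004, scale=0.01, size=2520)
f, g = continuous_kelly(daily, risk_free=0.03)
print("Kelly fraction:", f, "expected growth:", g)
```

Both outputs are very sensitive to the estimated mean, so a different sample window can give a very different fraction.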

Here is a resource with those equations in it: Practical Implementation of the Kelly Criterion: Optimal Growth Rate, Number of Trades, and Rebalancing Frequency for Equity Portfolios

I note that the authors of this paper DO NOT agree that the Kelly criterion optimizes the Sharpe ratio. I will say I am trying to understand that without a lot of comment here. Just trying to learn without recommending any particular portfolio to anyone.

But here is a source that says what ChatGPT and I are saying: Kelly vs. Markowitz Portfolio Optimization

Specifically, the above source says: “According to Kelly and Ed Thorp (and explained in my book), F above not only maximizes the compounded growth rate, but it also maximizes the Sharpe ratio.”

Also: “Put another way: the maximum growth rate is achieved when the Sharpe ratio is maximized.” Which is what ChatGPT said when it said this: “Therefore, the Kelly criterion can be understood as an attempt to maximize the Sharpe ratio,…”

Tony, I know you code Python well. Here is a resource with some code: How To Improve Your Trading System With The Kelly Criterion

I will note that calculating portfolios that maximize the Sharpe ratio is prone to errors because of issues with the stability of the matrix inverse and with the critical line algorithm (CLA) that such optimizers use. So the first paper could have (probably would have) found a different portfolio because of this issue (in part at least).
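To make the multi-asset version concrete, here is a rough sketch. It assumes the standard unconstrained form of the Kelly leverage vector, F = C^-1 (μ - r), where C is the covariance matrix; I am assuming that is what "F" refers to in the quote above. All inputs are made-up numbers. It solves the linear system rather than forming the inverse explicitly, prints the condition number of the covariance matrix as a crude stability check, and compares the Sharpe ratio of the Kelly weights with the same weights rescaled to sum to 1.

```python
import numpy as np

# Hypothetical annualized inputs for three assets (assumptions, not thread data).
mu = np.array([0.08, 0.10, 0.04])
r = 0.03
vol = np.array([0.15, 0.20, 0.07])
corr = np.array([[1.0, 0.6, 0.1],
                 [0.6, 1.0, 0.0],
                 [0.1, 0.0, 1.0]])
cov = np.outer(vol, vol) * corr

def sharpe(w):
    """Sharpe ratio of a portfolio with weights w (any scale)."""
    return (w @ (mu - r)) / np.sqrt(w @ cov @ w)

# Unconstrained Kelly weights: solve cov @ F = mu - r instead of inverting cov.
kelly = np.linalg.solve(cov, mu - r)
tangency = kelly / kelly.sum()     # same direction, rescaled to sum to 1

# A large condition number means small input errors can move the weights a lot.
print("condition number of cov:", np.linalg.cond(cov))
print("Sharpe of Kelly weights:", sharpe(kelly))
print("Sharpe of rescaled weights:", sharpe(tangency))

# Random long-only mixes for comparison; none should beat the Kelly direction.
rng = np.random.default_rng(0)
best_random = max(sharpe(w) for w in rng.dirichlet(np.ones(3), size=5000))
print("best Sharpe among random mixes:", best_random)
```

In this unconstrained setting the Kelly weights and the rescaled weights have the same Sharpe ratio, because scaling a portfolio up or down does not change it. Once you add constraints such as no leverage or long-only, the two optimizations do not have to agree.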

And I think that is one of the issues I am having with Portfolio Visualizer. Often the portfolio that purports to give the optimal Sharpe ratio does not have the best Sharpe ratio even with a backtest that has all of the historical data.

Anyway, I think it is an honest question. I could DEFINITELY be missing something.

I hope that goes toward answering your question while I still ponder this without a definite answer.

**Edit and Warning:** rehash of above. This is so surprising that I WANT CONFIRMATION FROM MULTIPLE SOURCES. But the proof is also so simple. So simple that I do not understand why Ed Thorp has not said this (he may have, but I do not have the quote). That really makes me wonder. These continuous-Kelly formulas are in his book, so I really wonder.

TL;DR: I still think I might be wrong but maybe not crazy to still wonder on this.

Q: I am still trying to get something definite on whether finding the optimal-Kelly mix of assets maximizes the Sharpe ratio. It does seem to me that what you said is correct. That is, for continuous Kelly the growth formula is g = 1/2 * (Sharpe ratio)^2 and optimal Kelly maximizes growth. So it seems that, with a constant (i.e., 1/2), you are maximizing the Sharpe ratio squared. Assuming the Sharpe ratio is positive, both the Sharpe ratio and the Sharpe ratio squared are positive and monotonically increasing. Therefore maximizing the growth should maximize the Sharpe ratio. Do you agree, and do you wish to expand on what we have both said?

ChatGPT 4 answer: "Yes, I agree with your understanding. Based on the continuous Kelly growth formula, g = 1/2 * (Sharpe ratio)^2, it appears that maximizing growth would be equivalent to maximizing the square of the Sharpe ratio. Assuming the Sharpe ratio is positive, both the Sharpe ratio and its square are positive and monotonically increasing. Therefore, maximizing growth should indeed maximize the Sharpe ratio.

It’s important to remember that this relationship holds under the assumptions we have discussed, such as constant mean and standard deviation in the future. Additionally, maximizing the Sharpe ratio does not necessarily mean that the portfolio will have the lowest risk; rather, it means that the portfolio will have the best risk-adjusted return. In practice, investors may have different risk preferences and may choose to prioritize other portfolio characteristics, such as minimizing drawdowns or maintaining a lower level of volatility."

The debate is finally over. The Kelly-Criterion and mean-variance optimization people are going to join forces? :thinking: :rofl: I get it; I missed something, but what? :worried:

Jim

Jim -

Perhaps an example might illuminate something. Try calculating the Sharpe ratios and the Kelly criteria for the following groups of ETFs. Group 1: OPER, TFLO, BIL, USFT, CLTL. Group 2: ERX, DIG, MEXX, CANE, FCG. And try basing those calculations just on the last two years’ returns.

The first group has extremely high 2-year Sharpe ratios (between 3.8 and 6.3, according to a screen using Sharpe2Y) and extremely low returns. The second group has middling Sharpe ratios (around 1) and very high returns (all over 65%).

I haven’t done any of the calculations myself, but I’d be curious what you come up with.
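As a rough sketch of what that comparison might look like, here are two stylized assets with made-up round numbers. These are NOT the actual figures for any of the ETFs above; the helper `kelly_summary` is also just for illustration. One asset is cash-like (tiny excess return, tiny volatility, Sharpe ratio of 5) and one is volatile (65% excess return, 65% volatility, Sharpe ratio of 1).

```python
def kelly_summary(excess_return, vol):
    """Sharpe ratio, Kelly fraction, and excess growth for one stylized asset."""
    sharpe = excess_return / vol
    kelly_f = excess_return / vol**2
    growth_full_kelly = sharpe**2 / 2.0              # excess growth at full Kelly
    growth_unlevered = excess_return - vol**2 / 2.0  # excess growth at f = 1
    return sharpe, kelly_f, growth_full_kelly, growth_unlevered

# Made-up annualized figures (assumptions, not real ETF data).
cash_like = kelly_summary(excess_return=0.005, vol=0.001)
energy_like = kelly_summary(excess_return=0.65, vol=0.65)

labels = ("Sharpe", "Kelly fraction", "growth at full Kelly", "growth unlevered")
for name, stats in (("cash-like", cash_like), ("energy-like", energy_like)):
    print(name, dict(zip(labels, stats)))
```

With these made-up inputs, the cash-like asset wins on Sharpe ratio and on growth at full Kelly, but only at a Kelly fraction in the thousands, which nobody can actually borrow. Without leverage, the volatile asset has far higher growth.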

Yuval,

Thanks, I will do it. Just for clarification of the problem: without being able to use leverage, Portfolio Visualizer will simply pick the single stock with the highest returns (almost universally). Or for ETFs, if you put QQQ into it, you will end up 100% QQQ (much of the time). It will do that for the Kelly criterion.

And for maximizing the Sharpe ratio it might do the same (probably would actually). Both methods have the problem of producing portfolios that are too concentrated, too dependent on past returns, and not really being risk controlled (even with historical data).

As bad as that is, they have a worse problem: they are disasters out-of-sample!

So, I hope at least, I am not looking at the standard approach for either. FWIW, I think there is a better way that I have not really discussed. Not so far anyway.

I am more looking for the mix of assets that puts all of them at say 1/10 fractional-Kelly. Fractional-Kelly so that they can all be levered up to 100% Kelly (or not). Keeping all assets at 1/10 Kelly (without any leverage) should be pretty low-risk and compound nicely. With some mix of assets anyway.

But not just optimal-Kelly for each asset. Really I want optimal-Kelly with the correlation of the assets in mind. This is what is used in the denominator of the above equations: The standard deviation with the correlations taken into account. This much is pretty much standard for either method (I am not doing anything new with that).

There is room there for that to be a very nice risk-control measure (i.e., all assets at 1/10 optimal Kelly with correlations considered). Nice risk control whether it happens to meet the approval of the CAPM advocates or not. It also offers advantages, including portfolios that are not so concentrated. There are other advantages, especially if you shrink the expected returns (e.g., with Bayesian statistics) to get more realistic expected returns. More realistic and less dependent on the historical return data.
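One possible reading of this idea, as a sketch only: take the full Kelly weights that account for correlations, C^-1 (μ - r), and scale them by 1/10. The function name `fractional_kelly` and its parameters are made up for illustration. The `shrink` parameter is a crude linear pull of each expected return toward the cross-sectional average, a simple stand-in for real Bayesian shrinkage and not a Bayesian model.

```python
import numpy as np

def fractional_kelly(mu, cov, r=0.0, fraction=0.1, shrink=0.0):
    """Fractional Kelly weights using the full covariance matrix.

    shrink=0 uses the raw expected returns; shrink=1 replaces every
    expected return with the cross-sectional mean.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    mu_shrunk = (1.0 - shrink) * mu + shrink * mu.mean()
    full_kelly = np.linalg.solve(cov, mu_shrunk - r)   # C^-1 (mu - r)
    return fraction * full_kelly

# Hypothetical annualized inputs for two assets (assumptions, not thread data).
mu = [0.09, 0.05]
cov = [[0.04, 0.006],
       [0.006, 0.01]]
print("1/10 Kelly, raw returns:   ", fractional_kelly(mu, cov, r=0.03))
print("1/10 Kelly, half shrinkage:", fractional_kelly(mu, cov, r=0.03, shrink=0.5))
```

The weights do not have to sum to 1. Whatever is left over sits in cash at the risk-free rate, and multiplying all of the weights by the same factor levers the mix toward full Kelly without changing its proportions.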

That having been said, I do not claim to be doing anything else in this post other than better-framing the question.

I do not know the answer. It is also not so trivial to calculate this with the correlations (Portfolio Visualizer comes at this in a different way). It can be done, but probably not today.

TL;DR: Great question! Wish I had a quick answer. FWIW, this can probably be tested and even cross-validated. Not necessarily this week even.

Jim

Ok, I can’t add much to this conversation except for this blog post. It manages to mention partial-Kelly, Sharpe ratio and Markowitz’s frontier. Can’t vouch for its correctness, so reader beware.

Thanks Walter. That definitely addresses the question. FWIW, this author is in the camp that partial-Kelly will not maximize the Sharpe ratio, I think. But also he likes the risk characteristics of partial-Kelly which would be fine with me (fine either way if it works).

Nice!

Yuval and anyone still reading this,

Yuval, I did not address your question directly, but I have some insights. First, my approach would maximize μ/σ^2 FOR EACH STOCK. But the standard deviation could be greater with my method than if I tried to optimize the overall variance (or Sharpe ratio). That is something I was a bit fuzzy on at the time.

Without going into all of the mathematical details, I have something from Portfolio Visualizer that could help illustrate, if not fully answer. I sought to optimize these assets: XLE XLU XLK XLP XLB XLY XLI XLV XLF TLT. They underperformed the S&P 500, as one might expect, largely because TLT has underperformed the S&P 500, I think.

My strategy also underperformed Portfolio Visualizer’s “Max Sharpe Ratio” but that was because it had a higher standard deviation. The average return of my strategy was actually the same!!! I.e., the volatility drag made it underperform slightly. And again, that is because I was not optimizing the overall standard deviation but rather the standard deviation of each asset.
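A small sketch of the difference, with made-up inputs (these are not the sector ETFs above and it does not reproduce the Portfolio Visualizer results). It compares weights built from μ/σ^2 for each asset separately against weights from the full covariance matrix, both rescaled to sum to 1, and it estimates the volatility drag with the usual approximation growth ≈ mean - variance/2.

```python
import numpy as np

# Hypothetical annualized excess returns, vols, and correlations (assumptions).
mu_excess = np.array([0.06, 0.05, 0.02])
vol = np.array([0.20, 0.18, 0.12])
corr = np.array([[1.0, 0.8, -0.2],
                 [0.8, 1.0, -0.1],
                 [-0.2, -0.1, 1.0]])
cov = np.outer(vol, vol) * corr

def normalize(w):
    """Rescale weights so they sum to 1 (fully invested, no leverage)."""
    return w / w.sum()

def stats(w):
    """Mean excess return, std dev, Sharpe ratio, and approximate growth."""
    mean = w @ mu_excess
    sd = np.sqrt(w @ cov @ w)
    return mean, sd, mean / sd, mean - 0.5 * sd**2   # last term: volatility drag

per_asset = normalize(mu_excess / vol**2)            # ignores correlations
joint = normalize(np.linalg.solve(cov, mu_excess))   # uses the full covariance

for name, w in (("per-asset", per_asset), ("joint", joint)):
    mean, sd, sr, g = stats(w)
    print(f"{name:9s} weights={np.round(w, 3)} mean={mean:.4f} "
          f"sd={sd:.4f} sharpe={sr:.3f} growth~{g:.4f}")
```

The joint weights can never have a lower Sharpe ratio than the per-asset weights for the same inputs, since they are the maximum-Sharpe direction by construction. How large the gap is depends entirely on the correlations.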

Anyway, the long and short of it is that what I was proposing is different than maximizing the Sharpe ratio. Full stop. No hand-waving. No excuses after the fact.

That having been said, it just barely underperformed the S&P 500 despite having TLT in the assets, and it had a better Sharpe ratio. It also had some advantages over PV’s Max Sharpe Ratio portfolio, as far as not being as concentrated in a few assets.

It may or may not have some potential when used with recent momentum of the assets and/or Bayesian shrinkage (potentially giving better out-of-sample estimates of returns). BUT NOT THE SAME when implemented in the way I was thinking for the reasons stated above.

Thanks everyone for their input. I was missing something and I am glad to find out what it was.

Jim