The Hedge Algorithm and Expert Advice

We all follow expert advice. Maybe you follow your own (expert) advice. Even if it is a non-discretionary quantitative algorithm created by a Ph.D. mathematician then he/she (the Ph.D) can be called an expert whose advice you are following.

There is a whole branch of mathematics that gives you expert advice on the best way to follow expert advice.

Imagine that after a bunch of backtests and research, you want to start out following the advice of 5 ‘experts’ including your dad who recommends a 60/40 allocation of stocks and bonds. Maybe you follow another ‘expert’ by buying equal amounts of his designer models made available at P123. Maybe you mix some designer models with some ETFs. Maybe you mix some of your own ports with some ETFs.

Each combination (of ports, designer models and/or ETFs) is a strategy and constitutes an ‘expert’s advice.’ And hopefully, you have designed the strategies so that it is likely that they will have acceptable risk going forward.

For simplicity, let's say you start out investing equal amounts (20% of your capital) in each of the 5 strategies.

Going forward, do you “stay the course” and never alter your allocation? If you adjust the allocation of your capital based on the performance of the strategies, how do you do it?

There is a whole set of algorithms that try to give you the minimum amount of total regret when you are done. What is interesting about these algorithms is they make NO STATISTICAL ASSUMPTIONS. In essence they allow the market to do weird things–starting with not following a normal distribution (or any distribution for that matter)!!!

One of the better known algorithms for investors is the Cover Universal Portfolio. Here is a link to his original paper: Universal Portfolios

Here is the first line in the abstract:

"We exhibit an algorithm for portfolio selection that asymptotically outperforms the best stock in the universe."

Whoa!!! OMG!!! Are you serious?

So this is a serious paper and a serious claim. In real life there is the problem of trading costs and what exactly “asymptotically outperforms” means. Plus, it takes a truly great computer to run it for the 500 stocks in the S&P 500 (cannot be done). There are also serious attempts to address these issues.

But the Hedge Algorithm is easier and it has some good theoretical bounds on how much you are going to regret the allocations you made on the 5 strategies example that I started with above. And it does not seem too weird intuitively.

It gradually reallocates more resources to the better performing strategies like you would want. And you can even adjust how quickly it adjusts these allocations. This is the learning rate (or η) in the Wikipedia link below.

Probably easiest to read about it here (in Wikipedia, near the middle of the article): Multiplicative Weight Update Method

Skip to the formula at the end of the section on the Hedge Algorithm in Wikipedia.

The equation may look a little difficult but in reality it only has 2 main complications:

  1. The learning rate (η). Just pick .5 and change that later if you want.

  2. The cost of the decision (m, also called regret). Each period (say a month), each of the 5 strategies will have a regret. Your best performing strategy has zero regret because you would not have regretted putting all of your money into that strategy.

If one of your strategies underperformed the best strategy by 5% that month then your regret is 0.05 for that strategy (for that month). Simple. Plug it into the formula to figure out how much weight you want to put on a strategy (and then normalize the weights so they sum to 1).
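A minimal sketch of the regret calculation just described. The function name `monthly_regret` and the five monthly returns are made up for illustration; they are not from any real strategy.

```python
def monthly_regret(returns):
    """Regret of each strategy: how far it trailed the best strategy this period."""
    best = max(returns)
    return [best - r for r in returns]

# Hypothetical one-month returns for the 5 strategies (illustrative numbers only).
returns = [0.03, -0.02, 0.01, 0.03, 0.00]
regrets = monthly_regret(returns)

# The best strategies show zero regret; the second one trailed the best by 5 points.
print([round(x, 4) for x in regrets])
```

These regrets are what gets plugged into the weight formula each period.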

Is this a reasonable strategy? Better strategies? How do you readjust your asset allocation based on performance?

I can divide all of my P123 strategies and ETFs into 2 sets (2 strategies) that have good risk characteristics and call them 2 diversified portfolios, strategies or expert’s advice. And begin to adjust the weights of these 2 portfolios (strategies) according to their performance.

I can have a mathematically guaranteed limit on the total regret I will have when I retire and look back at what I have done. Cool math and kind of makes practical sense, I think. Should I use it? Uhhh…Let me think about that. And let me know what you think.


Hi Jim,

Did you try to tackle any of this in a practical way? I mean, have you developed anything on P123 (or anywhere else) that puts this to use? I started reading it and trying to make sense of it, then I decided to start over because he (Cover) lost me by the end of the first page (i.e., the title page). lol :joy:

Finally, I decided to go slower, search the text for anything of value. However, I haven’t encountered math this complex since I got a university degree in physics back in the prehistoric days. But ever since that degree I have been trying to dumb myself down by living life in the ‘real’ world.

So I’m wondering, are you aware of anyone out there who is using any of the concepts in this paper to make actual money – or is it just a circle jerk by someone trying to impress other geeks? I’m just wondering if there is any justification to soldier on through it, trying to extract something of value?

EDIT - I just re-read your post and see that my question to you is actually the question you are asking the community.

I note that you say:

"One of the better known algorithms for investors is the Cover Universal Portfolio."

I’m curious what made you say this? Are you actually aware of any firm or individual who is using these maths? I’ve been in the industry for 40 years, but I have never heard of it. But maybe I’ve been hanging out in the wrong off-campus bars… :face_with_monocle:


Thank you for your interest.

You may not believe me now but I am absolutely certain that you can do the math. Some of the set nomenclature is in there just because it has to be for a publication, I guess. And you can certainly ignore the proofs. I would move beyond that if you are interested.

That having been said, I have been looking at the Cover algorithm more since my first post on this. I do not think that is useful for most of us. Cool theory but to be truly useful for us the assets have to be very volatile, and uncorrelated. I think the author uses hindsight to find an example of assets where the algorithm is beneficial. You might consider using the Cover algorithm if you plan on using a lot of leverage with uncorrelated ETFs (probably should). Most of the time the hedge algorithm is just as good for us.

Beyond the difficulties of reading through the papers the hedge algorithm is very easy and can be entered into a spreadsheet. A spreadsheet being just as good or even better than Python.

w (this month) = w(last month)*exp(η * underperformance_of_port). Of course, you will normalize the weights.

w (this month) = weight you should place on a port this month.

w (last month) = the weight you placed on the port last month

underperformance_of_port is just how much the port underperformed the best port that you are funding (over the last month). So one of the ports (the best one) has no underperformance.

η is calculated with an easy equation that I will not show here. But I can post if you are interested.

Each month, you adjust the weight of your models according to this simple equation and normalize the weights. It reduces the weight of the ports that have not been doing so well in a rational manner.
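Here is a rough sketch of that monthly step, written in the sign convention of the equation above (weights multiplied by exp(η * underperformance), so η has to be negative for laggards to lose weight). The name `hedge_update`, the placeholder η = -0.5, and the underperformance numbers are assumptions for illustration only.

```python
import math

def hedge_update(weights, underperformance, eta):
    """One Hedge step: w_new = w_old * exp(eta * underperformance), then normalize.

    In this convention eta must be negative so that underperformers lose weight.
    """
    raw = [w * math.exp(eta * u) for w, u in zip(weights, underperformance)]
    total = sum(raw)
    return [r / total for r in raw]

# Five ports, equally weighted last month (hypothetical starting point).
weights = [0.2] * 5
# Hypothetical underperformance versus the best port over the last month.
underperformance = [0.0, 0.05, 0.02, 0.0, 0.03]
eta = -0.5  # placeholder learning rate, not a recommendation

new_weights = hedge_update(weights, underperformance, eta)
print([round(w, 4) for w in new_weights])
```

The same two operations (multiply, then divide by the sum) translate directly into two spreadsheet columns.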

I might use this. It is CERTAINLY the most rational way to adjust the weight of ports if you have no a priori belief about which is the better port.

Honestly, it could be done in 5 minutes once a month. And it gives you a mathematical guarantee of how much you are going to regret initially funding that dog-of-a-port you thought was so good. The regret is minimized because it reduces the funding of that port in the optimal manner based on the evidence as you collect more data.

Of course, your good ports continuously receive more funding that is taken from the ports that are doing poorly.

Do you need this? Just think of that port you regret ever putting any money into. Did you get out in a graceful manner with the least amount of loss (regret) possible? Did you reduce funding the first month it started to underperform, automatically reducing the funding of that port each month it continued to underperform? If you did, you probably do not need this.

Not sure who might need this. But I probably do.


This “hedge algorithm” doesn’t seem to compensate for the law of regression to the mean. Statistically, the portfolios that perform the best/worst in a given period are the most likely to underperform/outperform in the next. Of course, there can be good reasons for that not to happen . . . But at any rate, increasing the allocation to a portfolio that’s outperforming over a short period of time doesn’t make statistical/probabilistic sense to me, and it also goes against the principle of diversification, which would favor rebalancing to equal weight.

Have you looked at whether using this method backtests better or worse than rebalancing to equal weight? If you use individual stocks as proxies for portfolios, certainly equal-weight rebalancing would be better than favoring one-month performance. On the other hand, perhaps if you used industry or sector ETFs you’d get the opposite result. Just curious.

Hi Yuval,

So actually it does. That is part of the reason it exists. Let me try to give an intuitive sense of why.

Another algorithm people use is called “Follow the leader” which is highly sensitive to regression-to-the-mean. I think you will agree with me on that. And see why the hedge algorithm is at least better.

“Follow-the-leader” simply puts all your money in the strategy that has done the best to date. Kind of like putting all of your money into QQQ in the year 2000 because you have been following it for the last year and it has been beating SPY. So I agree regression-to-the-mean is a big problem, and some algorithms are particularly problematic, with “Follow the Leader” perhaps being the worst.

Also ‘follow-the-leader’ is what a lot of people do and they could do better. Those putting all of their money into QQQ in 2000 might agree with both of us. Although they did get their money back around 2015 if I read the chart right.
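For contrast, a bare-bones sketch of “follow-the-leader” as described above. The function name and the cumulative returns are hypothetical. The point is only that the whole allocation jumps to one strategy, whereas the hedge algorithm shifts weight gradually.

```python
def follow_the_leader(cumulative_returns):
    """Put 100% of capital in the strategy with the best record to date."""
    indices = range(len(cumulative_returns))
    leader = max(indices, key=lambda i: cumulative_returns[i])
    return [1.0 if i == leader else 0.0 for i in indices]

# Hypothetical cumulative returns to date for 3 strategies.
ftl_weights = follow_the_leader([0.12, 0.35, 0.08])
print(ftl_weights)  # all capital lands on the second strategy
```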

η is the learning rate and the time-horizon plays a role in its calculation. It plays a big part in how susceptible the algorithm is to the regression-to-the-mean problem. If you started using the hedge algorithm in 1999 with a 10 year retirement horizon, using ETFs like SPY and a few ports, regression-to-the-mean of QQQ would not have been much of a problem.

But you would have gradually shifted your money to the better models. Which obviously would be a good thing if you had a great port. Way better than keeping the weight of the port equal weight.

That having been said, regression-to-the-mean is never a good thing and it cannot be avoided entirely. There is a tension between wanting to put money into a good port and the possibility of getting burned if the port has regression-to-the-mean. Mathematically, this is the best way to manage that. And obviously better than “follow-the-leader.”

I am not recommending that anyone not use their common sense or ignore other factors as far as what they use in the hedge algorithm. This is for ports and ETFs for which you have no a priori expectations including your beliefs about regression to the mean.


Thank you, Jim. I would be interested in how to calculate η . . .

Thank you, Jim.

And Yuval, that was going to be my next question. Thanks.

Yuval and Chris,

I should say that the regret (underperformance_of_port) needs to be scaled to lie between 0 and 1. The endpoints are allowed, so it can be exactly 0 or exactly 1 (i.e., the closed interval [0,1]).

My ports (with rare outliers) have losses no-greater-than 10% in a week (you can use weeks instead of months). So I scale it such that an underperformance of 1% = .1, 2% = .2, …, 9% = .9, 10% = 1.

This also serves to trim the outliers. Any losses greater than 10% are trimmed to 1. Obviously, you will find the scaling method that works best for your ports and time-periods.
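A small sketch of the scaling and trimming just described, assuming the 10% cap used above. The function name `scale_regret` and the `cap` parameter are illustrative choices, not part of any library.

```python
def scale_regret(underperformance, cap=0.10):
    """Map raw underperformance (e.g. 0.03 for 3%) onto [0, 1], trimming at the cap."""
    return min(max(underperformance / cap, 0.0), 1.0)

# A 1% shortfall maps to about 0.1, 9% to about 0.9, and anything past 10% is trimmed to 1.
for raw in (0.0, 0.01, 0.09, 0.10, 0.25):
    print(raw, round(scale_regret(raw), 4))
```

A different cap (or a different scaling rule altogether) may suit ports with different volatility.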

If you have a plan for how long you want to use the hedge algorithm (e.g., until you retire):

η = -1*(8 ln(N)/n)^.5

N = number of ports you want to include

n = number of months until you retire (or the period you plan to use the hedge algorithm)

If you are not sure how long you will be using these ports with the hedge algorithm:

η = -2 * (ln(N)/t)^.5

In this case t is the number of months you have been using the hedge algorithm and η changes each month.
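Both formulas for η can be sketched in a few lines. This keeps the sign convention used in this post (η negative); the function names and the example inputs (5 ports, 120 months) are assumptions for illustration.

```python
import math

def eta_fixed_horizon(num_ports, num_periods):
    """eta = -1 * (8 ln(N) / n)^0.5, for a known horizon of n periods."""
    return -1.0 * math.sqrt(8.0 * math.log(num_ports) / num_periods)

def eta_open_ended(num_ports, periods_elapsed):
    """eta = -2 * (ln(N) / t)^0.5, recomputed each period as t grows."""
    return -2.0 * math.sqrt(math.log(num_ports) / periods_elapsed)

# Hypothetical setup: 5 ports and a 10-year (120-month) horizon.
print(round(eta_fixed_horizon(5, 120), 4))

# With no fixed horizon, eta shrinks toward zero as the months accumulate.
for t in (1, 12, 60, 120):
    print(t, round(eta_open_ended(5, t), 4))
```

math.log is the natural log, which matches ln in the formulas.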

Here is a link to check my work (go to slide 34/140): Hedge Algorithm. I believe that when the link uses log they mean ln or the natural log.

Also please note that the equation is often written with a minus sign before η in the literature (which I did not do above):

I.e., w (this month) = w(last month)*exp(-η * underperformance_of_port) is the way it is in the literature.

Using the equation as it is most often written in the literature, there would be no minus sign in the η constant that I give above. Or you would multiply the η I give above by -1. The way I did it is correct but confusing, I think. I am sorry I did it that way.

If you look at the link and see that I made any errors please let me know. I will run back through it with a couple of sources before I use it with funded ports (I think I will). Ideally, you would check my work before you dedicate any money to this. But once checked and confirmed to be correct to your satisfaction, I think it is easy enough to be used.

I can also copy and paste sections from a text if anyone is interested and wants confirmation from a text (hopefully, well edited and authoritative) on any of this. Maybe translate some of the jargon too. Text I use: Prediction, Learning, and Games by Nicolo Cesa-Bianchi and Gabor Lugosi.

Here is the first example of η copied from the text:

“with η = √(8 ln N / n), the upper bound becomes √((n/2) ln N).” No minus sign as discussed above. The upper bound is the maximum amount of regret you will have even if Ms. Market turns out to be the devil herself. Seems like the stock-market-gods have made it such that even the devil (and Ms. Market) are constrained by the math as far as how much they can mess with you if you diversify and use the hedge algorithm.

Cesa-Bianchi, Nicolo; Lugosi, Gabor. Prediction, Learning, and Games (p. 16). Cambridge University Press. Kindle Edition.



This is an example of ADVERSARIAL reinforcement learning.

All that talk about a potentially hostile Ms. Market was not made up out of thin air.

In ADVERSARIAL reinforcement learning the adversary is allowed to make up any strange market behavior she wants BEFORE THE START OF THE GAME.

  1. She does not get to change things after the start: she cannot learn from and exploit the algorithm you are using. Realistic for a small retail investor.

  2. She can mess with you as much as she wants with as much trending or reversion-to-the-mean behavior to throw you off your game as she desires.

  3. Each expert you use for the algorithm can have a varying edge, no edge at all or be 180 degrees wrong in her recommendations.

  4. No rules (or assumptions) about normality, continuity etc.

  5. Past history of the market is not involved in this but should be used to determine which sim you convert to ports to be used with the hedge algorithm, IMHO.

  6. One can debate how adversarial the market is but in a truly adversarial market this is the best way to go without any question. I will spare you the theorems.

In summary, this is the best way to play a game like the real market and have as little regret as is mathematically possible. Truly worth considering IMHO.
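To make the guarantee concrete, here is a rough simulation sketch. It assumes 5 strategies, 120 periods, and losses already scaled to [0,1]; the losses are just seeded random numbers standing in for whatever Ms. Market chose before the game. Nothing here is real market data, and `run_hedge` is a made-up name. The algorithm's loss each period is taken as the weighted average of the strategies' losses.

```python
import math
import random

def run_hedge(losses, eta):
    """Run Hedge over a loss table (rows = periods, columns = strategies).

    Uses the convention w = w * exp(eta * loss) with eta < 0.
    Returns (cumulative loss of the blend, cumulative loss of each strategy).
    """
    n_strategies = len(losses[0])
    weights = [1.0 / n_strategies] * n_strategies
    algo_loss = 0.0
    totals = [0.0] * n_strategies
    for row in losses:
        # Loss of the blended allocation this period, using last period's weights.
        algo_loss += sum(w * l for w, l in zip(weights, row))
        totals = [t + l for t, l in zip(totals, row)]
        # Multiplicative update, then renormalize so the weights sum to 1.
        raw = [w * math.exp(eta * l) for w, l in zip(weights, row)]
        s = sum(raw)
        weights = [r / s for r in raw]
    return algo_loss, totals

random.seed(0)  # fixed seed so the run is repeatable
N, n = 5, 120
losses = [[random.random() for _ in range(N)] for _ in range(n)]

eta = -math.sqrt(8.0 * math.log(N) / n)
algo_loss, totals = run_hedge(losses, eta)
regret = algo_loss - min(totals)          # shortfall versus the best single strategy
bound = math.sqrt((n / 2.0) * math.log(N))
print(f"regret = {regret:.3f}, bound = {bound:.3f}")
```

Swapping in any other table of losses in [0,1] should still leave the regret at or below the bound, which is the whole point of the adversarial setup.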


Theorem: Asymptotically, the per period regret–or per period loss compared to the best strategy included in the Hedge Algorithm–approaches zero.

Yuval and Chris,

I just wanted to show you how trivial the proof of this last theorem is. I thought you would appreciate the proof even if you think using the hedge algorithm in practice is pure BS (which it may be). This will be a proof for 2 strategies to make it simple.

So assume the authors are correct and the limit of your regret is √((n/2) ln N) for the hedge algorithm. This is the regret, or underperformance, of the hedge algorithm compared to the best strategy used in the hedge algorithm.

n is the number of periods (e.g., months) and N is the number of tickers (set to 2 for my proof).

For 2 stocks your total regret is: √((n/2) * 0.69), since ln 2 ≈ 0.69. (by assumption: authors’ equation for 2 stocks)

Right? So here is the rest of the proof.

√((n/2) * 0.69) is your total regret after n periods. (restatement of assumption). Also written: ((n/2) * 0.69)^0.5

√((n/2) * 0.69)/n is your per period regret. Just divide the above equation by n to get this (trivial use of algebra). Also written (((n/2) * 0.69)^0.5)/n to clarify that this equation does not take the square root of the denominator.

Your per period regret approaches zero as n becomes large. The numerator grows like the square root of n while the denominator grows like n, so the ratio shrinks like 1/√n and approaches 0 as n becomes large (basic algebra and/or calculus).
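A quick numerical check of that last step, taking the bound √((n/2) ln N) as given. The function name and the sample values of n are arbitrary choices for illustration.

```python
import math

def per_period_regret_bound(n, N=2):
    """Total bound sqrt((n/2) * ln N) divided by the number of periods n."""
    return math.sqrt((n / 2.0) * math.log(N)) / n

# The per-period bound shrinks as the number of periods grows.
for n in (12, 120, 1200, 12000):
    print(n, round(per_period_regret_bound(n), 4))
```

Each tenfold increase in n cuts the per-period bound by a factor of about √10, consistent with the 1/√n rate.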

End of proof!

Maybe practical; maybe not. But nice proof don’t you think? Simple and maybe even trivial.