# Mean Thoughts

on the Gambler's Mean

## How to calculate the value of a history of outcomes

### Tourist vs Gambler

Pingpong is a rather intellectual sport; at least intellectuals seem to gravitate to it. For example, four Seattle software engineers, Daryoush, Bryan, Kevin, and Tom, ancestrally Iranian, Chinese, African, and Scots, met at a bar after pingpong. After talking about stroke mechanics, we talked about math and money.

You might think that could become abstruse but the path is very very simple, so simple you might think there is nothing here. I assure you there is something here. So let me take you on the intellectual journey we took together.

It was all about Darius, the Persian. Thanks, Darius! So Darius educated us today about the harmonic mean. I've heard about the Golden Mean, but this is different.

Here is the basic question: you have some outcomes that you measured; how will you summarize them? There are many answers: the sum, the count, the average, the median (the value of the middle one after sorting), the mode (the most common value). Did you ever consider the product as a summary, or the log-sum? Sum is such a good summary that it is the root of the word "summary"; we use it in financial accounting (income minus expenses) to see where we are. But sum varies, and usually increases, with count or over time, and it hides speed or rate. Count is good, but it doesn't give a typical magnitude, just an event count or duration with each event counting as one. Average is our bread and butter, $$\mu = \text{sum}/\text{count}$$, scaling the sum by the count to try to get at the Typical.

But we will consider three approaches to extracting the Typical from some outcomes (say, a set of N numbers { xi | 1 ≤ i ≤ N }), using Plus (which includes Minus where xi < 0), Times, and Divide. (The three resulting means are also called the Pythagorean Means.)

| Combinator | Common Name | Technical Name | Formula |
| --- | --- | --- | --- |
| + | "average" | Arithmetic Mean | $$\frac{1}{N} \sum_{i=1}^{N} x_{i}$$ |
| * | N/A | Geometric Mean | $$\sqrt[N]{\prod_{i=1}^{N} x_{i}}$$ |
| / | N/A | Harmonic Mean | $$\frac{1}{\frac{1}{N} \cdot \sum_{i=1}^{N} \frac{1}{x_{i}}}$$ |

There are different appropriate contexts for thinking in terms of each type of mean, depending on how multiple values might combine.

The harmonic mean shows up in electronic circuits with resistors in parallel, where the combined resistance is the harmonic mean of the individual resistances divided by N. It's also great for estimating "true" travel durations: the smallest xi dominates the result. E.g., suppose one driver takes 2 minutes and another takes 100; then the harmonic mean formula gives

$$\frac{1}{ \frac{1}{2} \cdot (\frac{1}{2} + \frac{1}{100}) } = 2 \cdot \frac{1}{ 0.5 + 0.01 } = \frac{2}{ 0.51 } \approx 3.92$$

3.92 is a lot closer to 2 than it is to 100. In general this is appropriate if the deviations are mostly on the delay side and if delays are less informative. Maybe the 2nd driver stopped at McDonald's on the way. Unusually long durations should indeed be discounted, whereas driving times still don't go to zero even if you drive at 100 MPH. The shortest times are on average the truest, in some sense of average. The correct sense is "the harmonic mean."
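To make the three formulas concrete, here is a minimal Python sketch of the Plus, Times, and Divide means applied to the two drivers. The function names are my own, not anything standard, and the geometric and harmonic versions assume all values are positive.

```python
import math

def arithmetic_mean(xs):
    """Plus: add everything up and divide by the count."""
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """Times: multiply everything and take the Nth root (assumes values >= 0)."""
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    """Divide: the reciprocal of the average reciprocal (assumes values > 0)."""
    return len(xs) / sum(1 / x for x in xs)

durations = [2, 100]  # the two drivers' travel times, in minutes

print(round(arithmetic_mean(durations), 2))  # 51.0
print(round(geometric_mean(durations), 2))   # 14.14
print(round(harmonic_mean(durations), 2))    # 3.92
```

Same two numbers, three very different ideas of "typical": the arithmetic mean sits in the middle, while the geometric and harmonic means lean progressively harder toward the small value.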

Contrast that with the arithmetic mean, where neither small nor large dominates: all values count equally (including negatives!). Just add them up and divide by N; that's the 'average'.

We will think about this in terms of gambling, because life is a random sequence of bets on unknown outcomes, and intellectual responsibility means thinking about the unknown as being unknown, and therefore very much, exactly, like gambling. Not as in irresponsibly, but as in betting your life and not actually knowing what the outcome will be. We want to be good gamblers, is what I'm saying, because we are all betting our lives: we ARE gamblers, and confronting that fact will let us think more clearly.

So what do gamblers do to figure out the averages? To calculate the "average" casino day, drive two busloads to the casino, 100 tourists, each with $100 to bet, and look at their wallets at the end of the day. A few will win, most will lose some, and some will lose everything and have zero left. Add up the total and divide by the number of people; that's the average return on going to the casino with $100. Maybe the average is $75, or $50, but it's surely less than $100 unless the casino had a bad day, and it's surely more than $0 unless there was a robbery. Let's call that the one-day holiday or Tourist's average; it's also just the "average", technically speaking. Context will help determine what "average" is intended to mean in different situations; it's not always this one.

Indeed the Tourist's Average is very different from the Gambler's Average. The gambler goes to the casino 100 days in a row, and takes all his or her winnings from one day and plays them the next day and then again the next day. At some point s/he is going to have a bad day and hit zero. Then the rest of their casino adventures are in the bar and the restaurant, because the betting wallet is empty. The gambler's average return is the geometric mean: MULTIPLY all the daily returns (each expressed as a multiple of that day's starting wallet), and take the Nth root of the product; that's the Gambler's Average. If there is a single zero in that list, then the whole result is zero.
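Here is a small sketch of the gambler's compounding wallet. The daily return multipliers are made-up numbers for illustration (1.2 means ending the day with 120% of the wallet you started it with), and the function names are hypothetical.

```python
import math

# Hypothetical daily return multipliers, including one wipe-out day.
daily_returns = [1.2, 0.9, 1.5, 0.0, 1.1]

def tourist_average(returns):
    """Arithmetic mean: every day counts equally and independently."""
    return sum(returns) / len(returns)

def gambler_average(returns):
    """Geometric mean: Nth root of the product of the daily multipliers."""
    return math.prod(returns) ** (1 / len(returns))

def final_wallet(start, returns):
    """Roll the whole wallet into the next day's play, day after day."""
    wallet = start
    for r in returns:
        wallet *= r
    return wallet

print(tourist_average(daily_returns))    # about 0.94: looks like a mild loss
print(gambler_average(daily_returns))    # 0.0: one zero wipes out the product
print(final_wallet(100, daily_returns))  # 0.0: the wallet is empty
```

The design point is that the Gambler's Average is the one that agrees with the wallet: the starting stake times the geometric mean raised to the number of days equals the final wallet, which the Tourist's Average cannot promise.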

Thus with the geometric mean, similar to the harmonic mean, the smallest also dominates the results. Suppose you take the Gambler's Average of a 50% loss and a 50% gain.

$$G = \sqrt{(0.5) \cdot (1.5)} = \sqrt{0.75} \approx 0.866$$

Even though both x1 and x2 are 50% away from the neutral, no-change return of 1.0, the loser, which returns 0.5 of the investment, drags the whole result below neutral. Indeed if any instance hits zero, then the whole thing goes to zero: $$\sqrt[N]{0 \cdot \prod_{i=1}^{N-1} x_{i}} = 0$$

In exactly this way, the arithmetic and geometric means differ for investment gains. It turns out a 50% loss is not equal to a 50% gain, so you can't just add them up like a tourist. Your "average" of 100% return of capital might be an ACTUAL loss: if your two successive bets returned +50% and -50%, the geometric-mean return per bet is 0.866 - 1 = -13.4%, and the final cumulative return is 1.5 × 0.5 - 1 = -25%. Failure to lose ought to mean 100% return of principal; but if you had a brain fart and a 50% loss, now you need a 100% gain just to get back to even, Mister Gambler.
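The asymmetry between losses and gains is easy to tabulate. This is a minimal sketch, with `required_gain` being a name I made up: after losing a fraction of your stake, the remaining fraction has to be multiplied back up to 1.

```python
def required_gain(loss):
    """Fractional gain needed to break even after a fractional loss.

    After losing `loss`, you hold (1 - loss) of your stake, so you need
    a multiplier of 1 / (1 - loss), i.e. a gain of 1 / (1 - loss) - 1.
    """
    if not 0 <= loss < 1:
        raise ValueError("loss must be in [0, 1); a total loss cannot be recovered")
    return 1 / (1 - loss) - 1

for loss in (0.10, 0.25, 0.50, 0.75):
    print(f"lose {loss:.0%} -> need {required_gain(loss):.1%} to break even")
```

For the 50% loss in the text this gives exactly 100%, and the required gain grows without bound as the loss approaches the whole wallet.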

Okay, this has been fun, and yes, now you are better armed to not be taken advantage of by hedge funds and investment advisers that advertise their average returns (or by Snopes when they misrepresent pharmaceutical results).

Right? Let's make up a nice clear case. Suppose they made one bet in January and lost 50%, then another bet in February and March together and gained 75%. Then on April 1 your million dollars invested just became $$1M \times 0.5 \times 1.75 = 0.875M$$, which is a 1/8th or 12.5% cumulative LOSS, but they can advertise to their new customers, "Hey, we had some losses, some gains, but our 'average' return was (-50% + 75%)/2 = +12.5% GAIN." April Fools!
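The made-up fund above can be checked in a few lines. This is only a sketch of the arithmetic in the example; the variable names are mine, and the two bets are treated as two periods even though they cover different numbers of months.

```python
import math

period_returns = [-0.50, 0.75]  # the two bets, as fractional returns

# What the brochure says: the arithmetic mean of the period returns.
advertised = sum(period_returns) / len(period_returns)

# What the wallet says: compound the growth factors (1 + r).
growth = math.prod(1 + r for r in period_returns)
cumulative = growth - 1

# The honest per-bet average: the geometric mean of the growth factors, minus 1.
per_bet = growth ** (1 / len(period_returns)) - 1

print(f"advertised average: {advertised:+.1%}")  # +12.5%
print(f"cumulative return:  {cumulative:+.1%}")  # -12.5%
print(f"geometric per bet:  {per_bet:+.1%}")     # -6.5%
```

The advertised figure and the cumulative figure happen to have the same size and opposite signs here, which is what makes the example such a clean April Fools' joke.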

So no, when you are dealing with cumulative results, use the geometric mean. Unfortunately humans prefer to think in terms of the additive or arithmetic mean, which is what gives the meaning to April Fools' Day.

Don't be a fool. If you are in cumulative mode, like the gambler who bets his wallet day after day, then use the geometric mean. And we are all betting our whole wallet, day after day.
