You might think that could become abstruse but the path is very very simple, so simple you might think there is nothing here. I assure you there is something here. So let me take you on the intellectual journey we took together.
It was all about Darius, the Persian. Thanks, Darius! So Darius educated us today about the harmonic mean. I've heard about the Golden Mean, but this is different.
You know all this already, of course (patience will be rewarded soon!).
Consider three approaches to extracting the Typical from some outcomes (say, a set of N numbers \( \{ x_i \mid 1 \le i \le N \} \)), using Plus (which includes Minus where \( x_i < 0 \)), Times, and Divide. (These are also called the Pythagorean Means.)
| Combinator | Common Name | Technical Name | Formula |
|---|---|---|---|
| + | "average" | Arithmetic Mean | $$ \frac{1}{N} \sum_{i=1}^{N} x_{i} $$ |
| * | N/A | Geometric Mean | $$ \sqrt[N]{\prod_{i=1}^{N} x_{i}} $$ |
| / | N/A | Harmonic Mean | $$ \frac{1}{\frac{1}{N} \sum_{i=1}^{N} \frac{1}{x_{i}}} $$ |
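If you'd rather see the three formulas as code than as symbols, here is a minimal Python sketch. The function names and the sample values `[1, 2, 4]` are my own, made up for illustration; the geometric version assumes non-negative inputs and the harmonic version assumes no zeros.

```python
import math

def arithmetic_mean(xs):
    # Plus: add everything up, divide by N.
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Times: multiply everything, take the N-th root.
    # Assumes every value is non-negative.
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    # Divide: the reciprocal of the average reciprocal.
    # Assumes no value is zero.
    return len(xs) / sum(1 / x for x in xs)

values = [1, 2, 4]  # made-up sample outcomes
am = arithmetic_mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)
print(f"arithmetic={am:.3f} geometric={gm:.3f} harmonic={hm:.3f}")
```

For positive numbers the three always line up in the same order: harmonic ≤ geometric ≤ arithmetic.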
The Harmonic Mean is the natural one for an electronic circuit with resistors in parallel: the equivalent resistance of N parallel resistors is the harmonic mean of the individual resistances, divided by N. It's also great for estimating "true" travel durations, because the smallest \( x_i \) dominates the result. E.g., suppose one driver takes 2 minutes and another takes 100; then the harmonic mean formula gives
$$ \frac{1}{ \frac{1}{2} * (\frac{1}{2} + \frac{1}{100}) } = 2 * \frac{1}{ 0.5 + 0.01 } = \frac{2}{ 0.51 } \approx 3.92 $$
3.92 is a lot closer to 2 than it is to 100. In general this is appropriate if the deviations are mostly on the delay side and if delays are less informative. Maybe the 2nd driver stopped at McDonald's on the way. Unusually long durations should indeed be discounted. Driving times, on the other hand, still don't go to zero even if you drive at 100 MPH. The shortest times are on average the truest, in some sense of average. The correct sense is "the harmonic mean."
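To watch the long delay get discounted, here is a small sketch using Python's standard `statistics` module. The helper name `compare_means` and the extra slow times (1,000 and 10,000 minutes) are my own additions; only the 2 and 100 come from the example above.

```python
from statistics import harmonic_mean, mean

def compare_means(fast, slow):
    # Returns (harmonic mean, arithmetic mean) of two travel times.
    return harmonic_mean([fast, slow]), mean([fast, slow])

fast = 2  # minutes, the quick driver
for slow in (100, 1_000, 10_000):  # ever longer stops at McDonald's
    hm, am = compare_means(fast, slow)
    print(f"slow={slow:>6}  harmonic={hm:.3f}  arithmetic={am:.1f}")
```

However long the slow driver takes, the harmonic mean of the two never reaches 4 minutes, while the arithmetic mean just keeps climbing.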
Contrast that with the Arithmetic Mean, where neither small nor large dominates: all values count equally (including negatives!). Just add them up and divide by N; that's the 'average'.
We will think about this in terms of gambling, because life is a random sequence of bets on unknown outcomes, and intellectual responsibility means thinking about the unknown as being unknown, and therefore very much, exactly, like gambling. Not as in, irresponsibly, but as in, actually betting your life and not actually knowing what the outcome will be. We want to be good gamblers, is what I'm saying, because we are all betting our lives: we ARE gamblers, and confronting that fact will let us think more clearly.

So what do gamblers do to figure out the averages? To calculate the "average" casino day, drive two busloads to the casino, 100 tourists, each with $100 to bet, and look at their wallets at the end of the day. A few will win, most will lose some, and some will lose everything and have zero left. Add up the total, divide by the number of people, and that's the average return on going to the casino with $100. Maybe the average is $75, or $50, but it's surely less than $100 unless the casino had a bad day, and it's surely more than $0 unless there was a robbery. Let's call that the one-day, Holiday, or Tourist's average; it's also just the "average", technically speaking. (Context will help determine what "average" is intended to mean in different situations; it's not always this one.)
Indeed the Tourist's Average is very different from the Investor's or Gambler's Average. The gambler goes to the casino 100 days in a row, and takes all his or her winnings from one day and plays them the next day and then again the next day. At some point s/he is going to have a bad day and hit zero. Then the rest of their casino adventures are in the bar and the restaurant, because the betting wallet is empty. The gambler's average return is the geometric mean: MULTIPLY all the returns, and take the Nth root of the product, that's the Gambler's Average. Because if there is a single zero in that list, then the whole result is zero.
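Here is a rough Python sketch of the two averages side by side. The daily multipliers are invented for illustration (1.2 means the wallet grew 20% that day, 0.0 means it was wiped out), and the function names are mine.

```python
import math

def tourist_average(multipliers):
    # Arithmetic mean: add up the outcomes, divide by N.
    return sum(multipliers) / len(multipliers)

def gambler_average(multipliers):
    # Geometric mean: multiply the outcomes, take the N-th root.
    return math.prod(multipliers) ** (1 / len(multipliers))

def final_wallet(start, multipliers):
    # Replay the days in order, reinvesting the whole wallet each day.
    wallet = start
    for m in multipliers:
        wallet *= m
    return wallet

good_days = [1.2, 0.9, 1.1, 1.3]            # hypothetical run with no wipeout
with_bad_day = [1.2, 0.9, 1.1, 0.0, 1.3]    # same run plus one day at zero

print(tourist_average(with_bad_day))    # still looks respectable
print(gambler_average(with_bad_day))    # 0.0: one zero kills the product
print(final_wallet(100, with_bad_day))  # 0.0: the betting wallet is empty
print(final_wallet(100, good_days), 100 * gambler_average(good_days) ** 4)
```

The last line shows why the geometric mean is the right "average" for the reinvesting gambler: compounding it over the N days reproduces the actual final wallet.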
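Thus with the geometric mean, similar to the harmonic mean, the smallest also dominates the results. Suppose you take the Gambler's Average of a 50% loss and a 50% gain: multiply the two returns and take the square root,
$$ \sqrt{0.5 * 1.5} = \sqrt{0.75} \approx 0.866 $$
which is less than 1, so the typical bet lost about 13.4%.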
In exactly this way, the arithmetic and geometric means differ for investment gains. It turns out a 50% loss is not equal to a 50% gain, so you can't just add them up like a tourist. Your "average" of 100% return of capital might be an ACTUAL loss: two successive bets returning +50% and -50% leave you with 1.5 * 0.5 = 0.75 of your capital, a cumulative loss of 25%, and the geometric mean return per bet is \( \sqrt{0.75} \approx 0.866 \), i.e. 0.866 - 1 = -13.4%. Failure to lose ought to mean 100% return of principal; but if you had a brain fart and a 50% loss, now you need a 100% gain just to get back to even, Mister Gambler.
Okay, this has been fun, and yes, now you are better armed to not be taken advantage of by hedge funds and investment advisers that advertise their average returns (or by Snopes when they misrepresent pharmaceutical results).
Right? Let's make up a nice clear case. Suppose they made one bet in January, and lost 50%, then another bet in February and March together, and gained 75%. Then on April 1 your million dollars invested just became \( 1M * 0.5 * 1.75 = 0.875M \), which is a 1/8th or 12.5% cumulative LOSS, but they can advertise to their new customers, Hey, we had some losses, some gains, but our "average" return was (-50% +75%)/2 = +12.5% GAIN. April Fools!
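The April Fools arithmetic is easy to check for yourself. A minimal Python sketch, with function names of my own choosing and the two bets written as fractional returns (-0.50 and +0.75):

```python
import math

def advertised_average(returns):
    # Arithmetic mean of the per-bet returns: what the brochure quotes.
    return sum(returns) / len(returns)

def cumulative_return(returns):
    # What actually happened to the money: compound every bet in turn.
    return math.prod(1 + r for r in returns) - 1

def geometric_average(returns):
    # Per-bet return that, compounded over N bets, gives the cumulative result.
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

bets = [-0.50, 0.75]  # the January bet, then the February-March bet

print(f"advertised: {advertised_average(bets):+.1%}")
print(f"cumulative: {cumulative_return(bets):+.1%}")
print(f"geometric:  {geometric_average(bets):+.1%} per bet")
```

Note that this averages per bet, not per month, since the two bets in the example cover different lengths of time.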
So no, when you are dealing with cumulative results, use the geometric mean. Unfortunately humans prefer to think in terms of the additive or arithmetic mean, which is what gives the meaning to April Fool's Day.
Don't be a fool. If you are in cumulative mode, like the gambler who bets his wallet day after day, then use the geometric mean. And we are all betting our whole wallet, day after day.