The Law of Large Numbers
When and why sample averages converge to the true mean — and when they don't.
The Law of Large Numbers (LLN) is one of the most intuitive results in probability: if you take more and more samples, the average converges to the true mean. It's why casinos always win in the long run and why polls become more accurate with larger sample sizes.
But this comforting result has a critical assumption — one that fails spectacularly for fat-tailed distributions.
The Law of Large Numbers
Law of Large Numbers (Strong Form)
If $X_1, X_2, \ldots$ are independent and identically distributed (i.i.d.) random variables with finite mean μ, then the sample average $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ converges to μ almost surely:

$$\Pr\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$$
This seems almost obvious — isn't this just what "average" means? The power of the theorem is that it tells us the convergence happens with probability 1, regardless of the specific distribution (as long as the mean exists).
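To see this concretely, here is a minimal simulation sketch in Python (NumPy assumed; the seed and sample sizes are illustrative choices): the running average of standard Gaussian draws settles toward the true mean of 0.

```python
import numpy as np

# Sketch: the running average of standard Gaussian draws (true mean 0)
# settles toward 0 as n grows. Seed and sample sizes are illustrative.
rng = np.random.default_rng(42)

samples = rng.normal(size=1_000_000)
for n in [10, 100, 1_000, 10_000, 100_000, 1_000_000]:
    print(f"n = {n:>9,}: sample mean = {samples[:n].mean():+.5f}")
```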
The Critical Requirement
The LLN requires E[|X|] < ∞ — the mean must be finite. If this condition fails, the entire theorem breaks down.
For the Pareto distribution with tail exponent α and minimum value $x_m$, the survival function is $\Pr(X > x) = (x_m / x)^{\alpha}$ for $x \ge x_m$.
The mean only exists when α > 1. Specifically:

$$E[X] = \begin{cases} \dfrac{\alpha x_m}{\alpha - 1} & \text{if } \alpha > 1 \\[6pt] \infty & \text{if } \alpha \le 1 \end{cases}$$
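A quick way to check this formula numerically is to sample from a Pareto by inverse-CDF and compare the sample mean against $\alpha x_m / (\alpha - 1)$. This is only a sketch; the α values, the seed, and $x_m = 1$ are illustrative:

```python
import numpy as np

# Sketch: inverse-CDF sampling of Pareto(alpha, x_m = 1) via
# X = U^(-1/alpha) with U ~ Uniform(0, 1]. Compare the sample mean
# against the theoretical mean alpha / (alpha - 1), which is only
# finite for alpha > 1. Seed and alpha values are illustrative.
rng = np.random.default_rng(0)
n = 1_000_000

for alpha in [3.0, 2.0, 1.5, 1.1, 0.9]:
    u = 1.0 - rng.random(n)          # Uniform on (0, 1]
    x = u ** (-1.0 / alpha)          # Pareto draws with x_m = 1
    theory = alpha / (alpha - 1.0) if alpha > 1 else float("inf")
    print(f"alpha = {alpha}: sample mean = {x.mean():>12.3f}, "
          f"theoretical mean = {theory}")
```

For α well above 1 the two columns agree closely; at α = 1.1 the sample mean is still far from settled even at a million draws, and at α = 0.9 there is nothing for it to settle toward.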
When LLN Fails
The Sample Average Doesn't Converge
If α ≤ 1 (infinite mean), the sample average does not converge to any finite value. No matter how many samples you collect, the average keeps jumping around, dominated by occasional extreme observations.
Wealth Distribution
Consider a stylized model of wealth where the distribution follows a Pareto with α ≤ 1. In this case:
- The theoretical mean wealth is infinite
- If you sample 1,000 people, the richest person might have more wealth than the other 999 combined
- Sample another 1,000, and the average could double or halve based on a single new observation
- The sample mean never "settles down" — it keeps wandering
This isn't a sampling error or bad luck — it's a fundamental feature of the distribution. The LLN simply does not apply.
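A rough simulation of this wealth example, assuming α = 1 and $x_m = 1$ (both illustrative, so the theoretical mean is infinite), shows the sample mean wandering from batch to batch and the richest individual frequently out-owning everyone else combined:

```python
import numpy as np

# Sketch of the stylized wealth model: Pareto draws with alpha = 1 and
# x_m = 1 (both illustrative, so the theoretical mean is infinite).
# Track the sample mean across batches of 1,000 "people" and how often
# the richest person out-owns the other 999 combined.
rng = np.random.default_rng(7)  # illustrative seed
alpha, batch, trials = 1.0, 1_000, 20

dominated = 0
for t in range(trials):
    wealth = (1.0 - rng.random(batch)) ** (-1.0 / alpha)
    richest = wealth.max()
    if richest > wealth.sum() - richest:
        dominated += 1
    print(f"batch {t + 1:>2}: sample mean wealth = {wealth.mean():>12.1f}")
print(f"richest > rest combined in {dominated} of {trials} batches")
```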
Interactive Demonstration
Explore the Law of Large Numbers with different distributions. Compare how sample means converge (or fail to converge) for Gaussian vs. Pareto distributions:
Each colored line represents an independent simulation run. As the sample size increases (x-axis, log scale), watch how the sample means behave: for finite-mean distributions such as the Gaussian (finite mean and variance), all runs converge quickly to the same value, μ = 0, while for an infinite-mean Pareto (α ≤ 1) the means keep jumping — a single extreme observation can dramatically shift the average.
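To reproduce something like this chart offline, here is a sketch using NumPy and Matplotlib; the seed, the number of runs, and the choice α = 1 for the Pareto panel are all illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch: running sample means of several independent runs, Gaussian
# vs. Pareto with alpha = 1 (infinite mean). Seed, run count, and n
# are illustrative.
rng = np.random.default_rng(1)
n, runs = 100_000, 5
ns = np.arange(1, n + 1)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for _ in range(runs):
    gauss = rng.normal(size=n)
    pareto = (1.0 - rng.random(n)) ** (-1.0 / 1.0)  # alpha = 1, x_m = 1
    axes[0].plot(ns, np.cumsum(gauss) / ns, lw=0.8)
    axes[1].plot(ns, np.cumsum(pareto) / ns, lw=0.8)

axes[0].set_title("Gaussian: runs converge to 0")
axes[1].set_title("Pareto (alpha = 1): means keep wandering")
for ax in axes:
    ax.set_xscale("log")
    ax.set_xlabel("sample size n (log scale)")
axes[0].set_ylabel("running sample mean")
plt.tight_layout()
plt.show()
```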
Weak vs Strong LLN
There are actually two versions of the Law of Large Numbers:
Weak Law of Large Numbers
The sample mean converges in probability to the true mean:

$$\lim_{n \to \infty} \Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) = 0 \quad \text{for every } \varepsilon > 0$$
Read: “The probability that the sample mean differs from μ by more than ε goes to zero”
Large deviations from the mean become increasingly unlikely
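The weak law can be checked directly by Monte Carlo: estimate the probability that the sample mean lands farther than ε from μ, and watch it shrink as n grows. A sketch, with ε, the seed, and the run count as illustrative choices:

```python
import numpy as np

# Sketch of the weak law: Monte Carlo estimate of P(|X_bar_n - mu| > eps)
# for standard Gaussian draws (mu = 0). eps, the seed, and the run count
# are illustrative choices.
rng = np.random.default_rng(3)  # illustrative seed
eps, runs = 0.1, 1_000

for n in [10, 100, 1_000, 10_000]:
    means = rng.normal(size=(runs, n)).mean(axis=1)
    p_hat = np.mean(np.abs(means) > eps)
    print(f"n = {n:>6}: estimated P(|sample mean| > {eps}) = {p_hat:.3f}")
```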
Strong Law of Large Numbers
The sample mean converges almost surely to the true mean:

$$\Pr\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$$
Read: “With probability 1, the sample mean converges to μ”
Convergence happens on almost every possible sequence of samples (the exceptional sequences form a set of probability zero)
The strong law implies the weak law, but not vice versa. For fat tails with finite mean but infinite variance (1 < α ≤ 2), both laws still hold — but the rate of convergence becomes extremely slow.
Rate of Convergence
Even when the mean exists, the speed at which the sample average converges depends on whether higher moments exist.
Convergence Speed Under Fat Tails
For thin-tailed distributions (finite variance), the error in the sample mean shrinks like $1/\sqrt{n}$. But for fat-tailed distributions with infinite variance, convergence is much slower.
With 1 < α < 2 (finite mean, infinite variance), the typical error in the sample mean scales as:

$$\left|\bar{X}_n - \mu\right| \sim n^{-(1 - 1/\alpha)}$$
For α = 1.5, this is $n^{-1/3}$ instead of $n^{-1/2}$ — to halve the error you need about 8 times more data instead of 4!
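You can observe these rates empirically by measuring the typical error of the sample mean at several values of n and fitting a log-log slope. A sketch, assuming $x_m = 1$ and illustrative run counts (heavy tails make the Pareto estimate noisy):

```python
import numpy as np

# Sketch: measure how the typical error |sample mean - mu| shrinks with n
# for a Gaussian (mu = 0) vs. a Pareto with alpha = 1.5 (mu = 3 when
# x_m = 1). Fitted log-log slopes should come out near -1/2 and -1/3.
rng = np.random.default_rng(5)  # illustrative seed
runs = 200
ns = [100, 1_000, 10_000, 100_000]

cases = [
    ("Gaussian", lambda size: rng.normal(size=size), 0.0),
    ("Pareto a=1.5", lambda size: (1 - rng.random(size)) ** (-1 / 1.5), 3.0),
]
for name, draw, mu in cases:
    errs = [np.mean(np.abs(draw((runs, n)).mean(axis=1) - mu)) for n in ns]
    slope = np.polyfit(np.log(ns), np.log(errs), 1)[0]
    print(f"{name:>13}: log-log slope of error vs. n ~ {slope:+.2f}")
```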
| Distribution Type | Convergence Rate | Samples for 10x Precision |
|---|---|---|
| Gaussian (finite variance) | $n^{-1/2}$ | 100x |
| Pareto with α = 1.5 | $n^{-1/3}$ | 1,000x |
| Pareto with α = 1.11 | $\approx n^{-1/10}$ | ~10 billion x |
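The last column follows mechanically from the rate: if the error shrinks like $n^{-p}$, then a 10-fold precision gain costs a factor of $10^{1/p}$ in samples. A small sketch of that arithmetic:

```python
# Sketch of the table's arithmetic: with error ~ n^(-p), a 10x precision
# gain multiplies the required sample size by 10^(1/p). For Pareto,
# p = 1 - 1/alpha (and p = 1/2 in the Gaussian case).
for label, p in [("Gaussian", 0.5),
                 ("Pareto alpha=1.5", 1 - 1 / 1.5),
                 ("Pareto alpha=1.11", 1 - 1 / 1.11)]:
    print(f"{label:>17}: rate n^-{p:.2f}, "
          f"10x precision needs ~{10 ** (1 / p):,.0f}x samples")
```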
Key Takeaways
- The Law of Large Numbers guarantees that sample averages converge to the true mean — but only if the mean exists
- For Pareto distributions with α ≤ 1, the mean is infinite and the LLN does not apply
- When the mean exists but variance is infinite (1 < α ≤ 2), the LLN holds but convergence is extremely slow
- You may need orders of magnitude more data under fat tails to achieve the same precision as under thin tails
- A single extreme observation can dominate the sample average, even with large n