The Law of Large Numbers

When and why sample averages converge to the true mean — and when they don't.

The Law of Large Numbers (LLN) is one of the most intuitive results in probability: if you take more and more samples, the average converges to the true mean. It's why casinos always win in the long run and why polls become more accurate with larger sample sizes.

But this comforting result has a critical assumption — one that fails spectacularly for fat-tailed distributions.

The Law of Large Numbers

Definition

Law of Large Numbers (Strong Form)

If X₁, X₂, … are independent and identically distributed (i.i.d.) random variables with finite mean μ, then the sample average X̄ₙ = (X₁ + ⋯ + Xₙ)/n converges to μ almost surely:

  P( X̄ₙ → μ as n → ∞ ) = 1

This seems almost obvious — isn't this just what "average" means? The power of the theorem is that it tells us the convergence happens with probability 1, regardless of the specific distribution (as long as the mean exists).
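The convergence is easy to watch in simulation. A minimal Python sketch (the fair-die example is an illustrative choice, not from the text): the running average of die rolls approaches the true mean of 3.5 as the sample grows.

```python
import random

random.seed(42)

def running_mean_die(n):
    """Average of n fair die rolls; the true mean is 3.5."""
    total = 0
    for _ in range(n):
        total += random.randint(1, 6)
    return total / n

# More samples -> the average settles ever closer to 3.5.
for n in (100, 10_000, 1_000_000):
    print(n, running_mean_die(n))
```

Every rerun follows a different path, but by a million rolls the average is pinned near 3.5 — that is the "almost surely" in action.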

The Critical Requirement

Finite Mean Required!

The LLN requires E[|X|] < ∞ — the mean must be finite. If this condition fails, the entire theorem breaks down.

For the Pareto distribution with tail exponent α and minimum value x_min, the survival function is P(X > x) = (x_min/x)^α for x ≥ x_min.

The mean only exists when α > 1. Specifically:

  • If α > 1: E[X] = α·x_min/(α − 1), which is finite
  • If α ≤ 1: E[X] = ∞
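A quick way to feel this boundary is to sample a Pareto by inverse transform and compare repeated sample means on either side of α = 1. A hedged sketch in Python (x_min = 1 and the specific α values are illustrative choices):

```python
import random

random.seed(0)

def pareto_sample(alpha, x_min=1.0):
    """Inverse-transform sampling: if U ~ Uniform(0, 1], then
    x_min * U**(-1/alpha) is Pareto with tail exponent alpha."""
    u = 1.0 - random.random()  # in (0, 1], avoids division by zero
    return x_min * u ** (-1.0 / alpha)

def sample_mean(alpha, n=100_000):
    return sum(pareto_sample(alpha) for _ in range(n)) / n

# alpha = 3.0: finite mean alpha/(alpha - 1) = 1.5; repeated averages agree.
# alpha = 0.8: infinite mean; repeated averages disagree wildly.
for alpha in (3.0, 0.8):
    print(alpha, [round(sample_mean(alpha), 2) for _ in range(3)])
```

Python's standard library also provides random.paretovariate(alpha), which performs the same inverse transform with x_min = 1.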

When LLN Fails

Key Insight

The Sample Average Doesn't Converge

If α ≤ 1 (infinite mean), the sample average does not converge to any finite value. No matter how many samples you collect, the average keeps jumping around, dominated by occasional extreme observations.

Example

Wealth Distribution

Consider a stylized model of wealth where the distribution follows a Pareto with α ≤ 1. In this case:

  • The theoretical mean wealth is infinite
  • If you sample 1,000 people, the richest person might have more wealth than the other 999 combined
  • Sample another 1,000, and the average could double or halve based on a single new observation
  • The sample mean never "settles down" — it keeps wandering

This isn't a sampling error or bad luck — it's a fundamental feature of the distribution. The LLN simply does not apply.
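The "richest person dominates" claim is easy to check by simulation. A sketch under the same stylized assumptions (α = 0.8 and minimum wealth 1 are illustrative choices, not from the text):

```python
import random

random.seed(7)

def pareto_sample(alpha, x_min=1.0):
    u = 1.0 - random.random()  # in (0, 1]
    return x_min * u ** (-1.0 / alpha)

# Wealth of 1,000 people drawn from an infinite-mean Pareto (alpha = 0.8).
wealth = [pareto_sample(0.8) for _ in range(1_000)]
top_share = max(wealth) / sum(wealth)
print(f"richest person holds {top_share:.0%} of the total")
```

Rerun with different seeds and the top share swings wildly — sometimes above 50% — which is exactly the instability that the failure of the LLN predicts.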

Interactive Demonstration

Explore the Law of Large Numbers with different distributions. Compare how sample means converge (or fail to converge) for Gaussian vs. Pareto distributions:


Convergence observed: The Gaussian distribution has finite mean and variance. Sample means converge quickly to μ = 0.

Each colored line represents an independent simulation. As sample size increases (x-axis, log scale), watch how the sample means behave.

For finite-mean distributions, all runs converge to the same value. For infinite-mean Pareto (α ≤ 1), the means keep jumping — a single extreme observation can dramatically shift the average.
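If the interactive chart is unavailable, the same comparison takes a few lines of Python (five runs each of a standard Gaussian and a Pareto with α = 1; the sample size of 50,000 is an arbitrary choice):

```python
import random

random.seed(11)

def pareto_sample(alpha, x_min=1.0):
    u = 1.0 - random.random()  # in (0, 1]
    return x_min * u ** (-1.0 / alpha)

def final_means(draw, n=50_000, runs=5):
    """Final sample mean of `runs` independent simulations of size n."""
    return [sum(draw() for _ in range(n)) / n for _ in range(runs)]

# Gaussian: all five runs land on top of each other, near mu = 0.
print("Gaussian:  ", [round(m, 3) for m in final_means(lambda: random.gauss(0, 1))])
# Pareto alpha = 1 (infinite mean): the runs scatter and never agree.
print("Pareto a=1:", [round(m, 1) for m in final_means(lambda: pareto_sample(1.0))])
```

The Gaussian runs are indistinguishable; the Pareto runs disagree with each other even at the largest sample size — the multi-run version of "never settles down."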

Weak vs Strong LLN

There are actually two versions of the Law of Large Numbers:

Definition

Weak Law of Large Numbers

The sample mean converges in probability to the true mean:

  P( |X̄ₙ − μ| > ε ) → 0 as n → ∞, for every ε > 0

Read: The probability that the sample mean differs from μ by more than ε goes to zero.

Large deviations from the mean become increasingly unlikely

Definition

Strong Law of Large Numbers

The sample mean converges almost surely to the true mean:

  P( X̄ₙ → μ as n → ∞ ) = 1

Read: With probability 1, the sample mean converges to μ.

Convergence happens on essentially every possible sequence of samples

The strong law implies the weak law, but not vice versa. For fat tails with finite mean but infinite variance (1 < α ≤ 2), both laws still hold — but the rate of convergence becomes extremely slow.
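The weak law's statement — that the probability of a large deviation shrinks with n — can be estimated directly by Monte Carlo. A sketch using Uniform(0, 1) samples (true mean 0.5; the choices ε = 0.1 and 2,000 trials are arbitrary):

```python
import random

random.seed(1)

def deviation_prob(n, eps=0.1, trials=2_000):
    """Monte Carlo estimate of P(|sample mean of n Uniform(0,1) draws - 0.5| > eps)."""
    count = 0
    for _ in range(trials):
        m = sum(random.random() for _ in range(n)) / n
        if abs(m - 0.5) > eps:
            count += 1
    return count / trials

# The weak law says this probability goes to 0 as n grows.
for n in (5, 20, 100):
    print(n, deviation_prob(n))
```

By n = 100 the estimated deviation probability is already close to zero, while at n = 5 it is substantial — the "increasingly unlikely" of the weak law, made concrete.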

Rate of Convergence

Even when the mean exists, the speed at which the sample average converges depends on whether higher moments exist.

Key Insight

Convergence Speed Under Fat Tails

For thin-tailed distributions (finite variance), the error in the sample mean shrinks like n^(−1/2). But for fat-tailed distributions with infinite variance, convergence is much slower.

With 1 < α < 2 (finite mean, infinite variance), the error in the sample mean shrinks like:

  n^(−(α−1)/α)

For α = 1.5, this is n^(−1/3) instead of n^(−1/2) — halving the error takes about 8 times more data instead of 4!

How much more data buys 10× more precision?

  • Gaussian (finite variance): error ~ n^(−1/2) → 100× more samples
  • Pareto with α = 1.5: error ~ n^(−1/3) → 1,000× more samples
  • Pareto with α = 1.1: error ~ n^(−1/11) → roughly 10¹¹× more samples
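These rates can be checked empirically by measuring the average absolute error of the sample mean at two sample sizes. A rough Python sketch (the sample sizes, trial count, and α = 1.5 are my choices; expect noisy numbers in the Pareto case, since the error there is itself heavy-tailed):

```python
import random

random.seed(3)

def pareto_sample(alpha, x_min=1.0):
    u = 1.0 - random.random()  # in (0, 1]
    return x_min * u ** (-1.0 / alpha)

def mean_abs_error(draw, true_mean, n, trials=300):
    """Average |sample mean - true mean| over many independent experiments."""
    total = 0.0
    for _ in range(trials):
        m = sum(draw() for _ in range(n)) / n
        total += abs(m - true_mean)
    return total / trials

alpha = 1.5
pareto_mean = alpha / (alpha - 1)  # = 3 for x_min = 1

for name, draw, mu in [("Gaussian  ", lambda: random.gauss(0, 1), 0.0),
                       ("Pareto 1.5", lambda: pareto_sample(alpha), pareto_mean)]:
    small, big = mean_abs_error(draw, mu, 100), mean_abs_error(draw, mu, 10_000)
    print(f"{name}: error shrank {small / big:.1f}x for 100x more data")
```

With 100× more data, the Gaussian error shrinks by about √100 = 10×, while the Pareto (α = 1.5) error shrinks by only about 100^(1/3) ≈ 4.6× — the slowdown the rates above predict.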

Key Takeaways

  • The Law of Large Numbers guarantees that sample averages converge to the true mean — but only if the mean exists
  • For Pareto distributions with α ≤ 1, the mean is infinite and the LLN does not apply
  • When the mean exists but variance is infinite (1 < α ≤ 2), the LLN holds but convergence is extremely slow
  • You may need orders of magnitude more data under fat tails to achieve the same precision as under thin tails
  • A single extreme observation can dominate the sample average, even with large n