The Central Limit Theorem

Why sums tend toward Gaussian, the conditions required, and the rate of convergence.

The Central Limit Theorem (CLT) is perhaps the most celebrated result in probability theory. It explains why the Gaussian distribution appears everywhere: whenever you sum many independent random variables, the result tends to look Gaussian.

This remarkable universality has made the Gaussian the default assumption in statistics — but as we'll see, the CLT has requirements that fat tails violate.

The Central Limit Theorem

Definition

Central Limit Theorem

If $X_1, X_2, \ldots, X_n$ are i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2$, then as $n \to \infty$:

$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1)$$

In other words: no matter what distribution you start with (subject to the requirements), the average of many samples will be approximately normally distributed.

Example

Dice Rolling

A single die roll has a uniform distribution on {1, 2, 3, 4, 5, 6} — not at all Gaussian. But roll 100 dice and compute the average:

  • Mean: $\mu = 3.5$
  • Standard deviation: $\sigma = \sqrt{35/12} \approx 1.71$
  • Standard error: $\sigma / \sqrt{100} \approx 0.171$

The distribution of this average is nearly perfectly Gaussian, centered at 3.5 with standard deviation 0.171.
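
These numbers are easy to verify by simulation; here is a minimal sketch in Python (NumPy, the seed, and the trial count are arbitrary choices, not part of the example above):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# 2,000 experiments, each averaging 100 fair die rolls
n_dice, n_trials = 100, 2_000
rolls = rng.integers(1, 7, size=(n_trials, n_dice))  # integers 1..6
averages = rolls.mean(axis=1)

print(f"mean of the averages: {averages.mean():.3f}")  # ~3.5
print(f"std of the averages:  {averages.std():.3f}")   # ~0.171 = sqrt(35/12)/10
```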

The Variance Requirement

Finite Variance Required!

The CLT requires $\sigma^2 < \infty$. If the variance is infinite, the theorem does not apply, and the sum does not converge to a Gaussian.

For Pareto distributions with tail exponent $\alpha$, recall:

  • $\alpha > 2$: Variance is finite, CLT applies
  • $1 < \alpha \le 2$: Mean exists but variance is infinite, CLT does not apply
  • $\alpha \le 1$: Even the mean is infinite

Key Insight

The Gaussian Assumption is Often Wrong

Many real-world phenomena have tail exponents $\alpha$ between 1 and 3: financial returns, earthquake magnitudes, city populations, book sales. For all of these, either the CLT doesn't apply at all, or it applies so slowly as to be practically useless.
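
To make the slow-or-absent convergence concrete, here is a rough sketch comparing standardized sums of Uniform(0,1) draws with standardized sums of Pareto draws at $\alpha = 1.5$ (the choice of $\alpha$, the sample sizes, and the seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def standardized_sums(draw, n, trials=10_000):
    """Sum n i.i.d. draws per trial, then standardize by the empirical mean/std."""
    totals = draw(size=(trials, n)).sum(axis=1)
    return (totals - totals.mean()) / totals.std()

alpha = 1.5  # Pareto tail exponent: mean exists, variance is infinite
pareto = lambda size: rng.pareto(alpha, size=size) + 1.0   # classical Pareto, minimum 1
uniform = lambda size: rng.uniform(0.0, 1.0, size=size)

for name, draw in [("Uniform(0,1)", uniform), ("Pareto(1.5)", pareto)]:
    z = standardized_sums(draw, n=1_000)
    # A true Gaussian sample of this size would essentially never exceed ~5 sigma
    print(f"{name:12s}  max |z| = {np.abs(z).max():.1f}")
```

Even with a thousand terms per sum, the Pareto case keeps producing observations many standard deviations from the center, while the uniform case behaves like a textbook Gaussian.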

Rate of Convergence: Berry-Esseen

The CLT tells us that convergence happens, but how fast? The Berry-Esseen theorem quantifies this:

Definition

Berry-Esseen Theorem

If $X_1, \ldots, X_n$ are i.i.d. with mean $\mu$, variance $\sigma^2$, and finite third absolute moment $\rho = E\left[|X_i - \mu|^3\right]$, then:

$$\sup_x \left| P\!\left( \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \le x \right) - \Phi(x) \right| \le \frac{C \rho}{\sigma^3 \sqrt{n}}$$

Read: The maximum difference between the CDF of the standardized mean and the normal CDF is bounded by C rho over sigma cubed root n

The approximation error decreases like 1/√n

The constant $C$ is approximately 0.4748 (the best known upper bound). This tells us:

  • Error decreases as $1/\sqrt{n}$
  • 100 samples give ~10% approximation error
  • 10,000 samples give ~1% approximation error

But notice: Berry-Esseen requires a finite third moment. For fat-tailed distributions, the third moment is often infinite, and the convergence rate becomes much slower.
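
As a sanity check on these rates, the sketch below compares the empirical worst-case CDF error for standardized means of Exponential(1) variables against the Berry-Esseen bound (the exponential distribution, the value $C = 0.4748$, and the sample sizes are assumptions made for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)

n, trials = 100, 50_000
C = 0.4748  # best known value of the Berry-Esseen constant

# Exponential(1): mean 1, std 1; estimate the third absolute central moment rho
mu, sigma = 1.0, 1.0
rho = np.mean(np.abs(rng.exponential(1.0, size=1_000_000) - mu) ** 3)

# Empirical CDF of the standardized sample mean vs the standard normal CDF
means = rng.exponential(1.0, size=(trials, n)).mean(axis=1)
z = np.sort((means - mu) / (sigma / np.sqrt(n)))
ecdf = np.arange(1, trials + 1) / trials
max_err = np.max(np.abs(ecdf - norm.cdf(z)))

print(f"empirical max CDF error: {max_err:.4f}")
print(f"Berry-Esseen bound:      {C * rho / (sigma**3 * np.sqrt(n)):.4f}")
```

The bound is conservative (the observed error is typically much smaller), but both shrink at the $1/\sqrt{n}$ rate.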

Interactive Demonstration

Explore the Central Limit Theorem by seeing how sums of random variables converge (or fail to converge) to a Gaussian:

[Interactive chart: choose a distribution and the number of samples summed (default 10)]

CLT applies: With Uniform(0,1) (finite variance), the standardized sum converges to a standard normal as n increases. Try increasing n to see better convergence.

The histogram shows the distribution of standardized sums from 2,000 simulations. The dashed green curve is the standard normal distribution — what the CLT predicts.

For distributions with finite variance, increasing n makes the histogram match the normal curve. For infinite-variance distributions, the tails remain heavier than Gaussian no matter how large n gets.
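
To reproduce the demo offline, here is a minimal matplotlib sketch along the same lines (the Uniform(0,1) choice, 2,000 trials, and n = 10 mirror the demo's description; everything else is an assumption):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

rng = np.random.default_rng(seed=3)

n, trials = 10, 2_000
sums = rng.uniform(0.0, 1.0, size=(trials, n)).sum(axis=1)

# Standardize using the exact mean and variance of a sum of n Uniform(0,1) draws
mu, var = n * 0.5, n / 12.0
z = (sums - mu) / np.sqrt(var)

x = np.linspace(-4, 4, 400)
plt.hist(z, bins=50, density=True, alpha=0.6, label="standardized sums")
plt.plot(x, norm.pdf(x), "g--", label="standard normal")
plt.legend()
plt.show()
```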

Why Does the Gaussian Emerge?

The deep reason the Gaussian appears is related to stability:

Definition

Stability Property

A distribution is stable if a sum of independent copies (suitably scaled) has the same distribution. The Gaussian is stable: if $X_1, \ldots, X_n$ are independent $\mathcal{N}(\mu, \sigma^2)$, then

$$X_1 + X_2 + \cdots + X_n \sim \mathcal{N}(n\mu,\, n\sigma^2)$$

Read: The sum of n independent Gaussians is itself Gaussian with scaled parameters

Adding Gaussians gives you another Gaussian

Key Insight

The Gaussian is Special

Among all distributions with finite variance, the Gaussian is the only stable distribution. This is why sums converge to it — it's the unique "attractor" in the finite-variance world.

But there are other stable distributions with infinite variance. These are the attractors for fat-tailed sums.
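
A small numerical illustration of both points, assuming nothing beyond NumPy: sums of Gaussians stay Gaussian with parameters $n\mu$ and $n\sigma^2$, while the Cauchy distribution (a stable law with infinite variance) is so stable that averaging does not concentrate it at all:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n, trials = 50, 100_000

# Gaussian: a sum of n N(mu, sigma^2) draws should be N(n*mu, n*sigma^2)
mu, sigma = 1.0, 2.0
g_sums = rng.normal(mu, sigma, size=(trials, n)).sum(axis=1)
print(f"Gaussian sums: mean {g_sums.mean():.2f} (expect {n * mu:.0f}), "
      f"var {g_sums.var():.1f} (expect {n * sigma**2:.0f})")

# Cauchy: the sample mean of n standard Cauchy draws is again standard Cauchy,
# so its interquartile range does not shrink as n grows
c_means = rng.standard_cauchy(size=(trials, n)).mean(axis=1)
q25, q75 = np.percentile(c_means, [25, 75])
print(f"Cauchy sample means (n={n}): IQR {q75 - q25:.2f} (standard Cauchy IQR = 2.00)")
```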

Practical Implications

Example

Stock Market Returns

Daily stock returns are often modeled as independent draws from a common distribution. If returns were truly Gaussian:

  • A "5-sigma" daily move should occur once every 14,000 years
  • A "10-sigma" move should essentially never happen

In reality, 5-sigma moves happen several times per decade, and 10-sigma moves have occurred multiple times in history (Black Monday 1987, Flash Crash 2010).

This is evidence that market returns have fatter tails than the Gaussian, likely with infinite variance, meaning the CLT convergence is either absent or extremely slow.
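
The "14,000 years" figure follows directly from the Gaussian tail probability; a quick check, assuming roughly 252 trading days per year:

```python
from scipy.stats import norm

TRADING_DAYS_PER_YEAR = 252

for k in (5, 10):
    p = norm.sf(k)  # P(Z > k): probability of a one-sided k-sigma move
    years = 1.0 / (p * TRADING_DAYS_PER_YEAR)
    print(f"{k}-sigma move: P = {p:.2e}, expected once every {years:,.0f} years")
```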

The practical consequences are severe:

  • Risk models fail: Value-at-Risk based on Gaussian assumptions dramatically underestimates tail risk
  • Confidence intervals are wrong: Standard confidence intervals assume approximate normality
  • Diversification is less effective: The CLT underpins the idea that diversification reduces risk proportionally to $1/\sqrt{n}$ (see the sketch below)
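
A rough sketch of the diversification point: for independent Gaussian "returns", the 1% loss quantile of an equally weighted portfolio shrinks like $1/\sqrt{n}$, while for an infinite-variance alternative (a Student-t with 1.5 degrees of freedom, chosen purely as an illustration) it shrinks far more slowly:

```python
import numpy as np

rng = np.random.default_rng(seed=5)
trials = 200_000

# 1% loss quantile of an equally weighted portfolio of n independent "assets"
for n in (1, 10, 100):
    gauss = rng.normal(0.0, 1.0, size=(trials, n)).mean(axis=1)
    fat = rng.standard_t(1.5, size=(trials, n)).mean(axis=1)  # infinite variance
    print(f"n={n:3d}  Gaussian 1% quantile: {np.percentile(gauss, 1):7.3f}   "
          f"t(1.5) 1% quantile: {np.percentile(fat, 1):7.3f}")
```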

Key Takeaways

  • The Central Limit Theorem says standardized sums converge to a Gaussian — but only if variance is finite
  • The Berry-Esseen theorem quantifies the convergence rate as $O(1/\sqrt{n})$, requiring a finite third moment
  • The Gaussian is the unique stable attractor for sums of finite-variance random variables
  • For fat-tailed distributions with $\alpha \le 2$ (infinite variance), the CLT does not apply, and sums do not become Gaussian
  • Real-world data (markets, natural disasters, etc.) often violate the CLT's requirements, making Gaussian-based statistics unreliable