When CLT Fails

What happens with infinite variance — stable limits and slower convergence.

When the variance is infinite, the Central Limit Theorem breaks down. But sums of random variables still converge to something — just not a Gaussian. Instead, they converge to a broader class called stable distributions.

This is perhaps the most important technical insight for understanding fat tails: there's a whole world beyond the Gaussian, with very different properties.

Generalized Central Limit Theorem

Definition

Generalized CLT

If X_1, X_2, ... are i.i.d. with tails decaying as P(|X| > x) ~ x^(-α) for 0 < α < 2, then properly normalized sums converge to an α-stable distribution:

(X_1 + X_2 + ... + X_n - b_n) / a_n  →  S_α

for suitable centering constants b_n and scaling constants a_n ~ n^(1/α).

Read: The normalized sum converges in distribution to an alpha-stable distribution

Sums of fat-tailed variables converge to a stable distribution with the same tail exponent

The key difference from the standard CLT:

  • The limit is not Gaussian — it's a stable distribution with the same tail exponent
  • The normalizing constants scale differently: n^(1/α) instead of √n (see the numerical sketch below)
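The scaling difference can be checked numerically. Below is a minimal NumPy sketch (assuming Pareto summands with x_min = 1 and α = 1.5; the helper name, seed, and trial counts are illustrative, not from the text): the spread of centered sums keeps growing under √n scaling but roughly stabilizes once divided by n^(1/α).

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.5                        # 1 < alpha < 2: finite mean, infinite variance
mu = alpha / (alpha - 1)           # mean of a Pareto variable with x_min = 1

def spread_of_scaled_sums(n, scale, trials=2_000):
    """5%-95% spread of centered sums of n Pareto draws under a given scaling."""
    x = rng.pareto(alpha, size=(trials, n)) + 1.0   # Pareto(x_min=1, alpha) samples
    s = (x.sum(axis=1) - n * mu) / scale(n)
    lo, hi = np.quantile(s, [0.05, 0.95])
    return hi - lo

for n in [100, 1_000, 10_000]:
    print(f"n={n:>6}  sqrt(n) scaling: {spread_of_scaled_sums(n, np.sqrt):7.2f}"
          f"   n^(1/alpha) scaling: {spread_of_scaled_sums(n, lambda m: m ** (1 / alpha)):7.2f}")
```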

The Family of Stable Distributions

Definition

Stable Distribution

A distribution is stable if a linear combination of independent copies has the same distribution (up to location and scale). Stable distributions are characterized by four parameters:

  • α ∈ (0, 2]: stability index (tail exponent)
  • β ∈ [-1, 1]: skewness parameter
  • γ > 0: scale parameter
  • δ: location parameter

The stability index α determines the tail behavior:

α value | Distribution | Properties
α = 2 | Gaussian | All moments finite, thin tails
α = 1, β = 0 | Cauchy | No mean, very fat tails
α = 1/2, β = 1 | Lévy | Extremely fat one-sided tail
α < 2 (general) | General stable | Tails decay as x^(-α)
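The defining stability property is easy to verify by simulation for two of these special cases. The sketch below (NumPy; the seed and quantile choices are illustrative) shows that adding two independent copies only rescales the distribution, and that the rescaling factor 2^(1/α) depends on α: √2 for the Gaussian, 2 for the Cauchy.

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 200_000
q = [0.90, 0.99]   # compare a couple of upper quantiles

# Gaussian case (alpha = 2): X1 + X2 has the same distribution as sqrt(2) * X
gauss_pair = rng.normal(size=(trials, 2)).sum(axis=1)
gauss_scaled = np.sqrt(2) * rng.normal(size=trials)
print(np.quantile(gauss_pair, q), np.quantile(gauss_scaled, q))

# Cauchy case (alpha = 1): X1 + X2 has the same distribution as 2 * X
cauchy_pair = rng.standard_cauchy(size=(trials, 2)).sum(axis=1)
cauchy_scaled = 2 * rng.standard_cauchy(size=trials)
print(np.quantile(cauchy_pair, q), np.quantile(cauchy_scaled, q))
```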

Visualizing Stable Convergence

The simulation below demonstrates the generalized CLT: sums of Pareto random variables (with infinite variance) converge to a stable distribution, not a Gaussian. Adjust the tail exponent α and sample size n to see how convergence differs from the classical CLT.

[Interactive chart: distribution of scaled sums, with controls for the tail exponent α (default 1.50) and the number of samples summed n (default 50)]

Convergence to Stable Distribution: With α = 1.5 < 2 (infinite variance), the scaled sums do NOT converge to a Gaussian (green dotted line). Instead, they converge to an α-stable distribution (orange dashed line) with heavier tails. The Gaussian underestimates extreme values.

Property | CLT (α = 2) | Generalized CLT (α < 2)
Limit distribution | Gaussian | Stable with index α
Normalization | √n | n^(1/α)
Tail behavior | Exponential decay | Power law, x^(-α)
Convergence speed | Fast, ~n^(-1/2) | Slow, ~n^(1/α - 1)

The histogram shows 2,000 simulations of scaled sums. For distributions with infinite variance (1 < α < 2), the generalized CLT tells us sums converge to a stable distribution, not a Gaussian.

Notice how the histogram has heavier tails than the Gaussian prediction. This is why using Gaussian-based statistics on fat-tailed data leads to systematic underestimation of extreme events.
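A simulation along these lines can be reproduced in a few lines. The sketch below assumes Pareto summands with x_min = 1, α = 1.5, and n = 50, and uses more trials than the on-page figure so the tail estimate is steadier; fitting a Gaussian to the bulk of the scaled sums badly understates how often large deviations occur.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, n, trials = 1.5, 50, 20_000
mu = alpha / (alpha - 1)                      # mean of Pareto(x_min=1, alpha)

# scaled sums, normalized by n^(1/alpha) as in the generalized CLT
x = rng.pareto(alpha, size=(trials, n)) + 1.0
z = (x.sum(axis=1) - n * mu) / n ** (1 / alpha)

# fit a Gaussian to the bulk (median and IQR-based sigma), as a naive analysis might
center = np.median(z)
sigma_bulk = (np.quantile(z, 0.75) - np.quantile(z, 0.25)) / 1.349
observed_tail = (z > center + 3 * sigma_bulk).mean()

print(f"observed P(beyond 3 'sigma'): {observed_tail:.4f}")
print("Gaussian prediction:          0.0013")   # P(Z > 3) for a standard normal
```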

Slower Convergence

When the CLT fails and we get convergence to a stable distribution instead, the rate of convergence is much slower: the fluctuation of the sample mean around the true mean shrinks like n^(1/α - 1).

Compare this to the Gaussian case, where the error shrinks as n^(-1/2):

α | Convergence rate | n needed for 1% error
2 (Gaussian) | n^(-1/2) | ~10,000
1.8 | n^(-0.44) | ~30,000
1.5 | n^(-1/3) | ~1,000,000
1.2 | n^(-1/6) | ~1 trillion
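These rates can be checked empirically. The sketch below (NumPy; the trial counts, seed, and the α = 1.5 Pareto example are illustrative) estimates the average error of the sample mean: for Gaussian data it shrinks by about √10 per tenfold increase in n, while for α = 1.5 it shrinks only by about 10^(1/3) ≈ 2.2.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = 1.5
true_mean = alpha / (alpha - 1)     # mean of Pareto(x_min=1, alpha)

def mean_error(sampler, target, n, trials=2_000):
    """Average |sample mean - target| over many trials (illustrative helper)."""
    x = sampler(size=(trials, n))
    return np.abs(x.mean(axis=1) - target).mean()

for n in [100, 1_000, 10_000]:
    gauss_err = mean_error(rng.normal, 0.0, n)
    pareto_err = mean_error(lambda size: rng.pareto(alpha, size) + 1.0, true_mean, n)
    print(f"n={n:>6}  Gaussian error: {gauss_err:.4f}   Pareto(1.5) error: {pareto_err:.4f}")
```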
Example

Practical Implications

Suppose you're estimating the average of a Pareto distribution whose tail exponent lies between 1 and 2 (finite mean, infinite variance).

  • With Gaussian data: 100 samples give reasonable estimates
  • With the fat-tailed Pareto: you need ~100,000 samples for comparable accuracy
  • Even then, a single extreme observation can throw off your estimate

In practice, you rarely have 100,000 samples of financial crises or pandemic events.

The Infinite Mean Case

When α ≤ 1, even the mean is infinite. In this regime:

Key Insight

Complete Breakdown

For α ≤ 1, the scaled sum does not converge to any useful limit in the traditional sense. The sample mean wanders without bound, dominated by increasingly extreme observations.

This isn't "slow convergence" — it's no convergence at all. Statistical inference based on sample averages becomes meaningless.
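A short NumPy sketch of this breakdown, assuming Pareto draws with α = 0.8 (the seed and checkpoints are illustrative): the running sample mean never settles down; it is dominated by the largest observations seen so far and, over long stretches, keeps growing.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha = 0.8                                    # alpha < 1: the mean is infinite

x = rng.pareto(alpha, size=1_000_000) + 1.0    # Pareto samples with x_min = 1
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in [100, 1_000, 10_000, 100_000, 1_000_000]:
    largest = x[:n].max()                      # the record observation that dominates the sum
    print(f"n={n:>9}  running mean = {running_mean[n - 1]:10.1f}"
          f"   largest single draw = {largest:12.1f}")
```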

Example

The Cauchy Distribution

The Cauchy distribution (α = 1), with density f(x) = 1 / (π(1 + x²)), is notorious in statistics.

A remarkable property: the sample mean of n Cauchy random variables has exactly the same distribution as a single observation!

Averaging provides no improvement whatsoever. Taking 1 million samples gives you no more information than taking 1.
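This property is easy to see by simulation. The sketch below (NumPy; the trial counts and seed are illustrative) compares quantiles of the average of 200 Cauchy draws with quantiles of a single draw; they estimate the same standard Cauchy distribution.

```python
import numpy as np

rng = np.random.default_rng(5)
trials, n = 50_000, 200

# distribution of the sample mean of n Cauchy draws vs. a single Cauchy draw
means_of_n = rng.standard_cauchy(size=(trials, n)).mean(axis=1)
single_draw = rng.standard_cauchy(size=trials)

qs = [0.05, 0.25, 0.50, 0.75, 0.95]
print("mean of 200 draws:", np.round(np.quantile(means_of_n, qs), 2))
print("single draw:      ", np.round(np.quantile(single_draw, qs), 2))
# both rows estimate the same standard Cauchy quantiles (about -6.3, -1, 0, 1, 6.3)
```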

Why This Matters

The failure of the CLT has profound implications:

  • Standard statistics don't apply: t-tests, ANOVA, regression — all assume approximate normality of averages
  • Risk models underestimate extremes: Gaussian-based VaR and expected shortfall are misleading
  • Diversification is less powerful: Portfolio theory assumes variance exists and the CLT applies
  • Historical averages are unreliable: Past performance truly doesn't predict future results
Key Insight

Taleb's Methodology

This is why Taleb advocates for:

  • Assuming fat tails until proven otherwise
  • Focusing on robustness to extremes rather than optimizing for averages
  • Using bounded payoffs and convex strategies
  • Being skeptical of any analysis that relies on sample statistics converging

Key Takeaways

  • When variance is infinite (α < 2), sums converge to stable distributions, not Gaussians
  • Stable distributions form a family parameterized by (α, β, γ, δ); the Gaussian is the special case α = 2
  • Convergence rate slows dramatically: n^(1/α - 1) instead of n^(-1/2)
  • For α ≤ 1 (infinite mean), sample averages don't converge to anything useful
  • Most standard statistical methods implicitly assume the CLT — they fail under fat tails