When CLT Fails
What happens with infinite variance — stable limits and slower convergence.
When the variance is infinite, the Central Limit Theorem breaks down. But sums of random variables still converge to something — just not a Gaussian. Instead, they converge to a broader class called stable distributions.
This is perhaps the most important technical insight for understanding fat tails: there's a whole world beyond the Gaussian, with very different properties.
Generalized Central Limit Theorem
Generalized CLT
If $X_1, X_2, \ldots$ are i.i.d. with tails decaying as $P(|X| > x) \sim x^{-\alpha}$ for $0 < \alpha < 2$, then properly normalized sums converge to an α-stable distribution:

$$\frac{S_n - b_n}{n^{1/\alpha}} \xrightarrow{d} S_\alpha$$

where $S_n = X_1 + \cdots + X_n$ and $b_n$ is a centering sequence.
Read: “The normalized sum converges in distribution to an alpha-stable distribution”
Sums of fat-tailed variables converge to a stable distribution with the same tail exponent
The key difference from the standard CLT:
- The limit is not Gaussian — it's a stable distribution with the same tail exponent
- The normalizing constants scale differently: $n^{1/\alpha}$ instead of $\sqrt{n}$
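This scaling difference can be checked numerically. Below is a minimal sketch (NumPy assumed; the choice of α = 1.5, the trial count, and the sample sizes are illustrative): centered sums of Pareto variables keep a stable spread when divided by $n^{1/\alpha}$, while the classical $\sqrt{n}$ normalization lets the spread keep growing.

```python
import numpy as np

rng = np.random.default_rng(42)
alpha = 1.5                 # tail exponent: 1 < alpha < 2, so infinite variance
mu = alpha / (alpha - 1.0)  # Pareto(alpha) mean = 3 for alpha = 1.5

def scaled_iqrs(n, trials=1000):
    """IQR of centered sums under the stable (n^(1/alpha)) and CLT (sqrt n) scalings."""
    # Inverse-CDF sampling: P(X > x) = x^(-alpha) for x >= 1
    x = (1.0 - rng.random((trials, n))) ** (-1.0 / alpha)
    s = x.sum(axis=1) - n * mu  # centered sums
    q75, q25 = np.percentile(s / n ** (1.0 / alpha), [75, 25])
    iqr_stable = q75 - q25
    q75, q25 = np.percentile(s / np.sqrt(n), [75, 25])
    iqr_clt = q75 - q25
    return iqr_stable, iqr_clt

iqr_stable_small, iqr_clt_small = scaled_iqrs(100)
iqr_stable_big, iqr_clt_big = scaled_iqrs(10_000)
print(f"n^(1/alpha) scaling: IQR {iqr_stable_small:.2f} -> {iqr_stable_big:.2f} (stabilizes)")
print(f"sqrt(n) scaling:     IQR {iqr_clt_small:.2f} -> {iqr_clt_big:.2f} (keeps growing)")
```

The interquartile range is used instead of the standard deviation because the variance is infinite here, so sample standard deviations never settle down.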
The Family of Stable Distributions
Stable Distribution
A distribution is stable if a linear combination of independent copies has the same distribution (up to location and scale). Stable distributions are characterized by four parameters:
- $\alpha \in (0, 2]$: stability index (tail exponent)
- $\beta \in [-1, 1]$: skewness parameter
- $\gamma > 0$: scale parameter
- $\delta \in \mathbb{R}$: location parameter
The parameter $\alpha$ determines the tail behavior:
| α Value | Distribution | Properties |
|---|---|---|
| $\alpha = 2$ | Gaussian | All moments finite, thin tails |
| $\alpha = 1$, $\beta = 0$ | Cauchy | No mean, very fat tails |
| $\alpha = 1/2$, $\beta = 1$ | Lévy | Extremely fat one-sided tail |
| $\alpha < 2$ | General stable | Tails decay as $x^{-\alpha}$ |
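The defining stability property can be verified directly for the Cauchy case. A minimal sketch (NumPy assumed; the sample size and quantile grid are arbitrary choices): the average of two independent Cauchy draws should have the same quantiles as a single draw.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stability check for the Cauchy case (alpha = 1, beta = 0): the average of two
# independent copies has the same distribution as one copy.
x_single = rng.standard_cauchy(100_000)
x_combo = (rng.standard_cauchy(100_000) + rng.standard_cauchy(100_000)) / 2.0

qs = [10, 25, 50, 75, 90]
q_single = np.percentile(x_single, qs)
q_combo = np.percentile(x_combo, qs)
print("single-draw quantiles:  ", np.round(q_single, 2))
print("averaged-pair quantiles:", np.round(q_combo, 2))
```

For a Gaussian the analogous check would require dividing the sum by $\sqrt{2}$ rather than 2; the divisor $2^{1/\alpha}$ is exactly what the stability index controls.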
The Gaussian is the Exception
The Gaussian ($\alpha = 2$) is the only stable distribution with finite variance. All other stable distributions ($\alpha < 2$) have infinite variance and power-law tails.
This is Taleb's central point: the Gaussian is not the "default" distribution — it's a special case that requires finite variance. When variance is infinite, different mathematics apply.
Visualizing Stable Convergence
The simulation below demonstrates the generalized CLT: sums of Pareto random variables (with infinite variance) converge to a stable distribution, not a Gaussian. Adjust the tail exponent α and sample size n to see how convergence differs from the classical CLT.
Convergence to Stable Distribution: With α = 1.5 < 2 (infinite variance), the scaled sums do NOT converge to a Gaussian (green dotted line). Instead, they converge to an α-stable distribution (orange dashed line) with heavier tails. The Gaussian underestimates extreme values.
| Property | CLT (α = 2) | Generalized CLT (α < 2) |
|---|---|---|
| Limit distribution | Gaussian | Stable (α) |
| Normalization | $\sqrt{n}$ | $n^{1/\alpha}$ |
| Tail behavior | Exponential decay | Power law ($x^{-\alpha}$) |
| Convergence speed | Fast ($n^{-1/2}$) | Slow ($n^{1/\alpha - 1}$) |
The histogram shows 2,000 simulations of scaled sums. For distributions with infinite variance (1 < α < 2), the generalized CLT tells us sums converge to a stable distribution, not a Gaussian.
Notice how the histogram has heavier tails than the Gaussian prediction. This is why using Gaussian-based statistics on fat-tailed data leads to systematic underestimation of extreme events.
Slower Convergence
When the CLT fails and we get convergence to a stable distribution instead, the rate of convergence is much slower: the approximation error decays as $n^{1/\alpha - 1}$.
Compare this to the Gaussian case, where the error goes as $n^{-1/2}$:
| α | Convergence Rate | n needed for 1% error |
|---|---|---|
| 2 (Gaussian) | $n^{-1/2}$ | ~10,000 |
| 1.8 | $n^{-4/9}$ | ~30,000 |
| 1.5 | $n^{-1/3}$ | ~1,000,000 |
| 1.2 | $n^{-1/6}$ | ~1 trillion |
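These rates can be checked by simulation. A rough sketch (NumPy assumed; the sample sizes and trial count are illustrative choices): for Pareto data with α = 1.5, the median error of the sample mean should shrink roughly like $n^{-1/3}$ rather than the Gaussian $n^{-1/2}$.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.5
mu = alpha / (alpha - 1.0)  # true Pareto(1.5) mean = 3

def median_abs_error(n, trials=1500):
    # Median |sample mean - true mean| over many independent Pareto experiments
    x = (1.0 - rng.random((trials, n))) ** (-1.0 / alpha)
    return np.median(np.abs(x.mean(axis=1) - mu))

e_small, e_big = median_abs_error(500), median_abs_error(8_000)
# Empirical decay exponent: should sit near 1/alpha - 1 = -1/3, not -1/2
slope = np.log(e_big / e_small) / np.log(8_000 / 500)
print(f"median error {e_small:.3f} -> {e_big:.3f}, decay exponent ~ {slope:.2f}")
```

The median error is used rather than the mean squared error, which does not exist when the variance is infinite.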
Practical Implications
Suppose you're estimating the average of a Pareto distribution with $\alpha = 1.5$ (finite mean, infinite variance).
- With Gaussian data: 100 samples give reasonable estimates
- With $\alpha = 1.5$: you need ~100,000 samples for comparable accuracy
- Even then, a single extreme observation can throw off your estimate
In practice, you rarely have 100,000 samples of financial crises or pandemic events.
The Infinite Mean Case
When $\alpha \le 1$, even the mean is infinite. In this regime:
Complete Breakdown
For $\alpha \le 1$, the scaled sum does not converge to any useful limit in the traditional sense. The sample mean wanders without bound, dominated by increasingly extreme observations.
This isn't "slow convergence" — it's no convergence at all. Statistical inference based on sample averages becomes meaningless.
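A quick simulation illustrates the wandering (NumPy assumed; the choice of α = 0.8 and the sample sizes are arbitrary): the typical sample mean keeps growing with n instead of settling toward any value.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha = 0.8  # alpha < 1: the mean itself is infinite

def typical_sample_mean(n, trials=200):
    # Median sample mean across independent Pareto(alpha) experiments
    x = (1.0 - rng.random((trials, n))) ** (-1.0 / alpha)
    return np.median(x.mean(axis=1))

m_small = typical_sample_mean(1_000)
m_big = typical_sample_mean(50_000)
# The "sample mean" grows roughly like n^(1/alpha - 1) = n^0.25: it never settles
print(f"typical sample mean at n=1e3: {m_small:.1f}, at n=5e4: {m_big:.1f}")
```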
The Cauchy Distribution
The Cauchy distribution ($\alpha = 1$) is notorious in statistics:

$$f(x) = \frac{1}{\pi(1 + x^2)}$$
A remarkable property: the sample mean of n Cauchy random variables has exactly the same distribution as a single observation!
Averaging provides no improvement whatsoever. Taking 1 million samples gives you no more information than taking 1.
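This property is easy to verify numerically. A minimal sketch (NumPy assumed; the trial count and averaging size are arbitrary): the interquartile range of 500-sample Cauchy means matches that of single draws.

```python
import numpy as np

rng = np.random.default_rng(1)

def iqr(a):
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

singles = rng.standard_cauchy(10_000)                    # one draw per "study"
means = rng.standard_cauchy((10_000, 500)).mean(axis=1)  # 500-sample averages

# Both IQRs sit near 2, the theoretical Cauchy interquartile range:
# averaging 500 observations buys no precision at all.
print(f"IQR of single draws:     {iqr(singles):.2f}")
print(f"IQR of 500-sample means: {iqr(means):.2f}")
```

With Gaussian data the second IQR would shrink by a factor of $\sqrt{500} \approx 22$; here it does not shrink at all.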
Why This Matters
The failure of the CLT has profound implications:
- Standard statistics don't apply: t-tests, ANOVA, regression — all assume approximate normality of averages
- Risk models underestimate extremes: Gaussian-based VaR and expected shortfall are misleading
- Diversification is less powerful: Portfolio theory assumes variance exists and the CLT applies
- Historical averages are unreliable: Past performance truly doesn't predict future results
Taleb's Methodology
This is why Taleb advocates for:
- Assuming fat tails until proven otherwise
- Focusing on robustness to extremes rather than optimizing for averages
- Using bounded payoffs and convex strategies
- Being skeptical of any analysis that relies on sample statistics converging
Key Takeaways
- When variance is infinite ($\alpha < 2$), sums converge to stable distributions, not Gaussians
- Stable distributions form a family parameterized by $(\alpha, \beta, \gamma, \delta)$; the Gaussian is the special case $\alpha = 2$
- Convergence rate slows dramatically: $n^{1/\alpha - 1}$ instead of $n^{-1/2}$
- For $\alpha \le 1$ (infinite mean), sample averages don't converge to anything useful
- Most standard statistical methods implicitly assume the CLT — they fail under fat tails