Defining Fat Tails
A formal hierarchy from thin-tailed to fat-tailed distributions, and what makes tails "fat".
We've seen examples of distributions with different tail behaviors. Now we'll define precisely what “fat tails” means mathematically and establish a hierarchy from the thinnest to the fattest tails.
The Intuition
A distribution has fat tails if extreme events are much more likely than a Gaussian would predict. But how do we make this precise?
The key is how the survival function decays as we move into the tails. Recall that the survival function S(x) = P(X > x) tells us how much probability mass lies beyond the value x.
Decay Comparison
Consider how P(X > x) behaves for large x in different distributions:
- Gaussian: decays like $e^{-x^2/2}$ (super-exponential)
- Exponential: decays like $e^{-\lambda x}$ (exponential)
- Pareto: decays like $x^{-\alpha}$ (polynomial)
At x = 10, evaluating the decay terms $e^{-x^2/2}$, $e^{-x}$, and $x^{-2}$ directly (λ = 1, α = 2, no normalization):
- Gaussian: e^(-50) ≈ 2 × 10^(-22) (essentially zero)
- Exponential: e^(-10) ≈ 0.00005
- Pareto (α=2): 10^(-2) = 0.01 (still substantial!)
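The figures above can be reproduced with a few lines of Python. This is a minimal sketch that evaluates the bare decay terms only, assuming λ = 1 and α = 2 and ignoring normalizing constants; the function names are illustrative, not from the original text.

```python
import math

def gaussian_decay(x):
    """Super-exponential decay term exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)

def exponential_decay(x, lam=1.0):
    """Exponential decay term exp(-lam * x); lam = 1 is an assumption."""
    return math.exp(-lam * x)

def pareto_decay(x, alpha=2.0):
    """Polynomial decay term x^(-alpha); alpha = 2 is an assumption."""
    return x ** (-alpha)

# Watch the three terms separate as x moves into the tail.
for x in (1, 2, 5, 10):
    print(
        f"x={x:>2}  gaussian={gaussian_decay(x):.2e}  "
        f"exponential={exponential_decay(x):.2e}  "
        f"pareto={pareto_decay(x):.2e}"
    )
```

At x = 10 the Gaussian term is about twenty orders of magnitude below the Pareto term, even though all three are of comparable size at x = 1.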
The Tail Hierarchy
Distributions can be classified by their tail behavior, from thinnest to fattest:
1. Thin-tailed (Subgaussian)
Tails no heavier than a Gaussian's. Includes bounded distributions such as Uniform[0,1].
$$P(X > x) \le c\, e^{-a x^2}$$
for some constants c, a > 0
2. Light-tailed (Exponential decay)
Tails decay at least exponentially fast. Includes Gaussian, Exponential, Gamma.
$$P(X > x) \le c\, e^{-\beta x}$$
for some β > 0, c > 0
3. Fat-tailed (Subexponential)
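Tails decay slower than any exponential. Pareto, Log-normal, Weibull (β < 1). For power-law members such as the Pareto:
$$P(X > x) \sim L(x)\, x^{-\alpha}$$
where L(x) is slowly varying. The Log-normal and Weibull (β < 1) tails decay faster than any power law but still slower than any exponential.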
4. Super-fat (Infinite mean)
So fat that even the mean doesn't exist. Cauchy, Pareto with α ≤ 1.
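The jump from "fat" to "super-fat" can be made concrete with the Pareto family, whose moments have a closed form. The sketch below assumes a Pareto with minimum value x_min = 1 and survival function P(X > x) = (x_min / x)^α; the helper name `pareto_moment` is hypothetical.

```python
import math

def pareto_moment(k, alpha, x_min=1.0):
    """k-th raw moment E[X^k] of a Pareto(alpha, x_min) distribution.

    The moment is finite only when k < alpha; otherwise the defining
    integral diverges and we return infinity.
    """
    if k >= alpha:
        return math.inf
    return alpha * x_min ** k / (alpha - k)

for alpha in (3.0, 2.0, 1.5, 1.0):
    mean = pareto_moment(1, alpha)
    second = pareto_moment(2, alpha)
    print(f"alpha={alpha}: mean={mean}, second moment={second}")
```

With α = 1.5 the mean is finite but the second moment (and hence the variance) is not; with α ≤ 1 even the mean is infinite, which is the "super-fat" case.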
Explore: The Tail Hierarchy
Compare survival functions across the hierarchy. Watch how they separate as you move further into the tails.
Survival Functions: P(X > x)
On a log-log scale, power laws appear as straight lines; a steeper slope means thinner tails.
Probability of X > 10
| Distribution | P(X > 10) | Ratio to Gaussian |
|---|---|---|
| Gaussian | 7.62e-24 | 1 |
| Exponential | 6.74e-3 | ≈ 9e+20 |
| Pareto α=3 | 0.0010 | ≈ 1e+20 |
| Pareto α=2 | 0.0100 | ≈ 1e+21 |
| Pareto α=1.5 | 0.0316 | ≈ 4e+21 |
Fat-tailed distributions assign vastly more probability to extreme events. The Gaussian entry assumes a standard normal, for which P(X > 10) ≈ 7.6 × 10^(-24), so a “10-sigma” event is roughly 10^21 times more likely under a Pareto with α = 2.
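The table can be recomputed without numerical underflow by using the complementary error function for the Gaussian tail. This is a sketch under stated assumptions: a standard normal, an exponential rate of 0.5 (inferred from the 6.74e-3 entry, not stated in the text), and a Pareto with x_min = 1. The function names are illustrative.

```python
import math

def gaussian_sf(x):
    """P(Z > x) for a standard normal, via erfc to stay accurate in the tail."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def exponential_sf(x, rate=0.5):
    """P(X > x) for an exponential; rate = 0.5 is an inferred assumption."""
    return math.exp(-rate * x)

def pareto_sf(x, alpha, x_min=1.0):
    """P(X > x) for a Pareto with minimum x_min (assumed to be 1)."""
    return (x_min / x) ** alpha if x > x_min else 1.0

x = 10.0
rows = [
    ("Gaussian", gaussian_sf(x)),
    ("Exponential", exponential_sf(x)),
    ("Pareto alpha=3", pareto_sf(x, 3.0)),
    ("Pareto alpha=2", pareto_sf(x, 2.0)),
    ("Pareto alpha=1.5", pareto_sf(x, 1.5)),
]
baseline = rows[0][1]
for name, p in rows:
    print(f"{name:<18} P(X > 10) = {p:.2e}   ratio to Gaussian = {p / baseline:.1e}")
```

Computing the Gaussian tail as `1 - cdf(x)` would round to zero at x = 10 in double precision, which is why `math.erfc` is used here instead.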
Formal Definitions
Heavy-tailed Distribution
A distribution is heavy-tailed (or fat-tailed) if its moment generating function is infinite for all t > 0:
$$M_X(t) = \mathbb{E}\left[e^{tX}\right] = \infty \quad \text{for all } t > 0$$
Read: “The expected value of e to the tX is infinite for all positive t”
No exponential moment exists: the tails are too heavy.
Equivalently, for any λ > 0:
$$\limsup_{x \to \infty} e^{\lambda x}\, P(X > x) = \infty$$
This means the tail decays slower than any exponential function.
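A numerical check makes the "slower than any exponential" condition tangible. The sketch below tracks the logarithm of e^(λx) · P(X > x) for a Pareto tail (α = 2) and an exponential tail (rate 1), with λ = 0.1; all three parameter values and the function names are illustrative assumptions. Working in log space avoids overflow for large x.

```python
import math

def pareto_log_sf(x, alpha=2.0):
    """log P(X > x) for a Pareto with x_min = 1 (valid for x >= 1)."""
    return -alpha * math.log(x)

def exponential_log_sf(x, rate=1.0):
    """log P(X > x) for an exponential with the given rate."""
    return -rate * x

def tilted_tail_log(log_sf, lam, x):
    """log of exp(lam * x) * P(X > x), computed without overflow."""
    return lam * x + log_sf(x)

lam = 0.1
for x in (10, 100, 1_000, 10_000):
    p = tilted_tail_log(pareto_log_sf, lam, x)
    e = tilted_tail_log(exponential_log_sf, lam, x)
    print(f"x={x:>6}  pareto: {p:>10.1f}  exponential: {e:>10.1f}")
```

For the Pareto the log of the tilted tail eventually grows without bound, so the product diverges however small λ is. For the exponential it diverges only when λ exceeds the rate, so the exponential fails the "for any λ > 0" requirement and is not heavy-tailed.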
Regularly Varying Tails
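A distribution F has regularly varying tails with index -α if:
$$P(X > x) = L(x)\, x^{-\alpha}$$
where L(x) is a slowly varying function, meaning:
$$\lim_{x \to \infty} \frac{L(tx)}{L(x)} = 1 \quad \text{for every } t > 0$$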
Slowly Varying Functions
Examples of slowly varying functions:
- Constants: L(x) = c
- Logarithms: L(x) = log(x), (log(x))^β
- Iterated logs: L(x) = log(log(x))
Slowly varying functions change so slowly compared to power functions that they don't affect the basic power law character.
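The defining ratio L(tx)/L(x) is easy to evaluate numerically. The sketch below compares two slowly varying functions from the list above with a small power function x^0.1, which is not slowly varying; the scaling factor t = 2 and the helper names are illustrative choices.

```python
import math

def scaling_ratio(L, t, x):
    """Return L(t * x) / L(x), which tends to 1 for slowly varying L."""
    return L(t * x) / L(x)

def log_log(x):
    """Iterated logarithm log(log(x)), defined for x > 1."""
    return math.log(math.log(x))

def small_power(x):
    """x^0.1: grows slowly, yet is NOT slowly varying."""
    return x ** 0.1

t = 2.0
for x in (1e2, 1e6, 1e12, 1e100):
    print(
        f"x={x:.0e}  log: {scaling_ratio(math.log, t, x):.4f}  "
        f"loglog: {scaling_ratio(log_log, t, x):.4f}  "
        f"x^0.1: {scaling_ratio(small_power, t, x):.4f}"
    )
```

The logarithmic ratios drift toward 1 as x grows, while the power-function ratio stays fixed at 2^0.1 ≈ 1.07 no matter how large x becomes. Even a tiny exponent changes the power-law index, whereas a logarithmic factor does not.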
Why Classification Matters
Different tail classes have fundamentally different statistical properties:
| Property | Light-tailed | Fat-tailed |
|---|---|---|
| Moments | All exist | May not exist |
| Sample mean | Converges quickly | Unstable |
| CLT | Standard √n rate | May fail or be slow |
| Extremes | Small contribution | Can dominate |
| Historical data | Reliable guide | Misleading |
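The "Extremes" row can be illustrated by measuring how much of a sample's total comes from its single largest observation. This is a rough simulation sketch, not a formal test: the sample size, the seed, the exponential rate of 1, and the Pareto index α = 1.5 are all arbitrary assumptions, and `max_share` is a hypothetical helper.

```python
import random

def max_share(samples):
    """Fraction of the sample total contributed by the largest observation."""
    return max(samples) / sum(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Light-tailed: exponential with rate 1.
exp_samples = [rng.expovariate(1.0) for _ in range(n)]
# Fat-tailed: Pareto with alpha = 1.5 (finite mean, infinite variance).
pareto_samples = [rng.paretovariate(1.5) for _ in range(n)]

print(f"exponential: largest value is {max_share(exp_samples):.4%} of the total")
print(f"pareto     : largest value is {max_share(pareto_samples):.4%} of the total")
```

In the light-tailed sample no single observation matters much, while in the fat-tailed sample one draw can account for a visibly larger share of the sum. Rerunning with different seeds changes the Pareto share far more than the exponential one, which is the instability the table describes.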
Taleb's Core Message
Most of our statistical tools assume light tails. When applied to fat-tailed domains, they give dangerously wrong answers. The first step is correctly identifying which world you're in.
Key Takeaways
- Fat tails mean extreme events are more likely than exponential decay would suggest
- The hierarchy: Subgaussian → Light-tailed → Fat-tailed → Infinite mean
- Fat-tailed = the moment generating function is infinite for every t > 0
- Regularly varying distributions have tails like $L(x)\, x^{-\alpha}$, where L(x) is slowly varying
- Different tail classes require fundamentally different statistical approaches