Defining Fat Tails

A formal hierarchy from thin-tailed to fat-tailed distributions, and what makes tails "fat".

We've seen examples of distributions with different tail behaviors. Now we'll define precisely what “fat tails” means mathematically and establish a hierarchy from the thinnest to the fattest tails.

The Intuition

A distribution has fat tails if extreme events are much more likely than a Gaussian would predict. But how do we make this precise?

The key is how the survival function decays as we move into the tails. Recall that $S(x) = P(X > x)$ tells us how much probability mass lies beyond the value x.

Example

Decay Comparison

Consider how P(X > x) behaves for large x in different distributions:

  • Gaussian: decays like $e^{-x^2/2}$ (super-exponential)
  • Exponential: decays like $e^{-\lambda x}$ (exponential)
  • Pareto: decays like $x^{-\alpha}$ (polynomial)

At x = 10, evaluating these decay rates with unit parameters (λ = 1 for the exponential) and ignoring normalizing constants:

  • Gaussian: e^(-50) ≈ 2 × 10^(-22) (essentially zero)
  • Exponential: e^(-10) ≈ 0.00005
  • Pareto (α=2): 10^(-2) = 0.01 (still substantial!)
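
The comparison above can be reproduced in a few lines. This is a minimal sketch, assuming unit parameters (λ = 1, α = 2) and ignoring normalizing constants; the function names are illustrative and not taken from any library.

```python
import math

def gaussian_decay(x):
    """Gaussian-type tail rate e^(-x^2/2), without normalizing constants."""
    return math.exp(-x * x / 2.0)

def exponential_decay(x, lam=1.0):
    """Exponential tail rate e^(-lam*x); lam = 1 is an assumed default."""
    return math.exp(-lam * x)

def pareto_decay(x, alpha=2.0):
    """Power-law tail rate x^(-alpha); alpha = 2 is an assumed default."""
    return x ** (-alpha)

# Print the three rates side by side as x moves into the tail.
for x in (1, 2, 5, 10):
    print(
        f"x={x:>2}  gaussian={gaussian_decay(x):.2e}  "
        f"exponential={exponential_decay(x):.2e}  "
        f"pareto={pareto_decay(x):.2e}"
    )
```

The three columns start within a factor of three of each other at x = 1 and are separated by many orders of magnitude by x = 10.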

The Tail Hierarchy

Distributions can be classified by their tail behavior, from thinnest to fattest:

1. Thin-tailed (Subgaussian)

Tails no heavier than a Gaussian's. Includes bounded distributions like Uniform[0,1].

$P(|X| > x) \le c\,e^{-a x^2}$ for some constants c, a > 0

2. Light-tailed (Exponential decay)

Tails decay at least exponentially fast. Includes Gaussian, Exponential, Gamma.

$P(X > x) \le c\,e^{-\beta x}$ for some β > 0, c > 0

3. Fat-tailed (Subexponential)

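Tails decay slower than any exponential. Pareto, Log-normal, Weibull (shape β < 1).

$P(X > x) \sim L(x)\,x^{-\alpha}$ for the power-law members of this class such as Pareto, where L(x) is slowly varying. Log-normal and Weibull (β < 1) tails also decay slower than any exponential, but faster than any power law.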

4. Super-fat (Infinite mean)

So fat that even the mean doesn't exist. Cauchy, Pareto with α ≤ 1.
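
To see why the mean can fail to exist, consider a Pareto distribution with minimum value 1 (an assumption made here for simplicity), so that P(X > x) = x^(-α) for x ≥ 1. For a nonnegative random variable, the mean equals the integral of the survival function:

```latex
E[X] = \int_0^\infty P(X > x)\,dx
     = 1 + \int_1^\infty x^{-\alpha}\,dx
     = \begin{cases}
         \dfrac{\alpha}{\alpha - 1}, & \alpha > 1 \\[6pt]
         \infty, & \alpha \le 1
       \end{cases}
```

The integral converges only when the tail falls off faster than 1/x. The Cauchy case differs in detail: its mean is undefined rather than infinite, because both tails diverge.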

Explore: The Tail Hierarchy

Compare survival functions across the hierarchy. Watch how they separate as you move further into the tails.

Survival Functions: P(X > x)

On a log-log scale, power laws appear as straight lines with slope -α. A steeper slope means thinner tails.
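
The straight-line claim can be checked numerically. The sketch below is illustrative only: it assumes a Pareto with minimum 1 and α = 2 and an exponential with rate 1, and measures the slope of log P(X > x) against log x between two points. The helper name loglog_slope is hypothetical.

```python
import math

def loglog_slope(sf, x1, x2):
    """Slope of log sf(x) versus log x between x1 and x2."""
    return (math.log(sf(x2)) - math.log(sf(x1))) / (math.log(x2) - math.log(x1))

def sf_pareto(x, alpha=2.0):
    """P(X > x) for a Pareto with minimum 1 (assumed), valid for x >= 1."""
    return x ** (-alpha)

def sf_exponential(x, rate=1.0):
    """P(X > x) for an exponential with the given rate (assumed rate 1)."""
    return math.exp(-rate * x)

# Compare slopes over a near interval and a far interval.
for lo, hi in ((2.0, 4.0), (20.0, 40.0)):
    print(
        f"[{lo:g}, {hi:g}]  pareto slope={loglog_slope(sf_pareto, lo, hi):.2f}  "
        f"exponential slope={loglog_slope(sf_exponential, lo, hi):.2f}"
    )
```

The Pareto slope stays at -α on every interval, while the exponential slope keeps getting steeper, which is why the exponential curve bends downward on a log-log plot.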

Probability of X > 10

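Distribution      P(X > 10)    Ratio to Gaussian
Gaussian          0.00e+0      1
Exponential       6.74e-3      Infinity
Pareto α=3        0.0010       Infinity
Pareto α=2        0.0100       Infinity
Pareto α=1.5      0.0316       Infinity

The Gaussian entry underflows to zero at the displayed precision, which is why every ratio shows as Infinity. For a standard normal, P(X > 10) ≈ 7.6 × 10^(-24), so the true ratios are finite but astronomically large.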

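Fat-tailed distributions assign vastly more probability to extreme events. For a standard Gaussian, x = 10 is a “10-sigma” event with probability of about 7.6 × 10^(-24); under a Pareto with α = 2 the same threshold has probability 0.01, roughly 10^21 times larger.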
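
The probabilities in the table can be recomputed directly. This sketch rests on several assumptions that are not stated in the text: a standard normal for the Gaussian, an exponential with rate 0.5 (chosen because e^(-5) ≈ 6.74e-3 matches the table entry), and Pareto distributions with minimum value 1.

```python
import math

def sf_gaussian(x):
    """P(X > x) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def sf_exponential(x, rate):
    """P(X > x) for an exponential distribution with the given rate."""
    return math.exp(-rate * x)

def sf_pareto(x, alpha, x_min=1.0):
    """P(X > x) for a Pareto with tail index alpha and minimum x_min."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

x = 10.0
baseline = sf_gaussian(x)  # tiny but nonzero in double precision
rows = [
    ("Gaussian", baseline),
    ("Exponential (rate 0.5)", sf_exponential(x, 0.5)),
    ("Pareto alpha=3", sf_pareto(x, 3.0)),
    ("Pareto alpha=2", sf_pareto(x, 2.0)),
    ("Pareto alpha=1.5", sf_pareto(x, 1.5)),
]
for name, p in rows:
    print(f"{name:<24} {p:10.3e} {p / baseline:10.3e}")
```

Using erfc rather than 1 - CDF keeps the Gaussian tail from rounding to zero, so the ratios come out finite.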

Formal Definitions

Definition

Heavy-tailed Distribution

A distribution is heavy-tailed (or fat-tailed) if its moment generating function is infinite for all t > 0:

$$M_X(t) = E\left[e^{tX}\right] = \infty \quad \text{for all } t > 0$$

Read: The expected value of e to the tX is infinite for all positive t

No exponential moment exists: the tails are too heavy

Equivalently, for any λ > 0:

$$\limsup_{x \to \infty} e^{\lambda x}\, P(X > x) = \infty$$

This means the tail decays slower than any exponential function.
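
A numerical illustration of this condition, as a rough sketch: multiply the tail probability by an exponential weight and see whether the product blows up. The code works in log space to avoid overflow and assumes a Pareto with minimum 1 and α = 2 and an exponential with rate 1; all function names are hypothetical.

```python
import math

def log_weighted_tail(log_sf, lam, x):
    """Return log( e^(lam*x) * P(X > x) ) = lam*x + log P(X > x)."""
    return lam * x + log_sf(x)

def log_sf_pareto(x, alpha=2.0):
    """log P(X > x) for a Pareto with minimum 1 (assumed), valid for x >= 1."""
    return -alpha * math.log(x)

def log_sf_exponential(x, rate=1.0):
    """log P(X > x) for an exponential with the given rate (assumed rate 1)."""
    return -rate * x

lam = 0.1  # a deliberately small exponential weight
for x in (10, 100, 1000, 10000):
    print(
        f"x={x:>5}  pareto={log_weighted_tail(log_sf_pareto, lam, x):9.2f}  "
        f"exponential={log_weighted_tail(log_sf_exponential, lam, x):10.2f}"
    )
```

Even with a weight as small as λ = 0.1, the Pareto column eventually grows without bound, while the exponential column heads to minus infinity. For the rate-1 exponential the weighted tail diverges only when λ > 1, not for every λ > 0, so it is not heavy-tailed under this definition.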

Definition

Regularly Varying Tails

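A distribution F has regularly varying tails with index -α (α > 0) if:

$$\bar{F}(x) = P(X > x) = L(x)\, x^{-\alpha}$$

where L(x) is a slowly varying function, meaning:

$$\lim_{x \to \infty} \frac{L(tx)}{L(x)} = 1 \quad \text{for all } t > 0$$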

Example

Slowly Varying Functions

Examples of slowly varying functions:

  • Constants: L(x) = c
  • Logarithms: L(x) = log(x), (log(x))^β
  • Iterated logs: L(x) = log(log(x))

Slowly varying functions change so slowly compared to power functions that they don't affect the basic power law character.
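
The defining ratio L(tx)/L(x) can be evaluated directly to see the difference between a slowly varying function and a power. This is an illustrative sketch with t = 2; the helper names and the comparison exponent 0.1 are assumptions chosen for the example.

```python
import math

def scaling_ratio(f, t, x):
    """Return f(t*x) / f(x); tends to 1 as x grows when f is slowly varying."""
    return f(t * x) / f(x)

def slow_log(x):
    """L(x) = log(x), a slowly varying function."""
    return math.log(x)

def small_power(x):
    """x^0.1, a power function: not slowly varying, however small the exponent."""
    return x ** 0.1

t = 2.0
for x in (1e3, 1e6, 1e12, 1e100):
    print(
        f"x={x:.0e}  log ratio={scaling_ratio(slow_log, t, x):.4f}  "
        f"power ratio={scaling_ratio(small_power, t, x):.4f}"
    )
```

The logarithm's ratio drifts toward 1, while the power's ratio stays fixed at 2^0.1 no matter how large x becomes.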

Why Classification Matters

Different tail classes have fundamentally different statistical properties:

Property          Light-tailed        Fat-tailed
Moments           All exist           May not exist
Sample mean       Converges quickly   Unstable
CLT               Standard √n rate    May fail or be slow
Extremes          Small contribution  Can dominate
Historical data   Reliable guide      Misleading

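
The "Extremes" row can be illustrated with a small simulation. This is a sketch under stated assumptions: it draws from an exponential with rate 1 and a Pareto with α = 1.1 and minimum 1 using Python's standard random module, then measures how much of the total sum comes from the single largest observation. The sample size, seed, and parameters are arbitrary choices for the example.

```python
import random

def max_share(samples):
    """Fraction of the total sum contributed by the single largest observation."""
    return max(samples) / sum(samples)

random.seed(0)  # fixed seed so the run is repeatable
n = 50_000

# Light-tailed: exponential with rate 1.
exp_samples = [random.expovariate(1.0) for _ in range(n)]
# Fat-tailed: Pareto with tail index 1.1 and minimum 1.
pareto_samples = [random.paretovariate(1.1) for _ in range(n)]

exp_share = max_share(exp_samples)
pareto_share = max_share(pareto_samples)
print(f"exponential: largest value is {exp_share:.4%} of the sum")
print(f"pareto:      largest value is {pareto_share:.4%} of the sum")
```

In the light-tailed sample no single observation matters; in the fat-tailed sample one observation can account for a visible share of the total, and that share changes noticeably from seed to seed.
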
Key Insight

Taleb's Core Message

Most of our statistical tools assume light tails. When applied to fat-tailed domains, they give dangerously wrong answers. The first step is correctly identifying which world you're in.

Key Takeaways

  • Fat tails mean extreme events are more likely than exponential decay would suggest
  • The hierarchy: Subgaussian → Light-tailed → Fat-tailed → Infinite mean
  • Fat-tailed = moment generating function is infinite for all t > 0
  • Regularly varying distributions have tails like $P(X > x) = L(x)\,x^{-\alpha}$, with L slowly varying
  • Different tail classes require fundamentally different statistical approaches