The Problem of Extremes

How the maximum grows with sample size, and why single observations can dominate.

One of the starkest differences between thin-tailed and fat-tailed distributions lies in how the maximum of a sample behaves. Understanding this explains why single observations can dominate statistics in fat-tailed domains.

Extremes in Thin-Tailed Distributions

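For thin-tailed distributions like the Gaussian, the maximum $M_n = \max(X_1, \dots, X_n)$ of $n$ samples grows very slowly:

$$E[M_n] \approx \mu + \sigma\sqrt{2 \ln n}$$

Read: The maximum of n samples sits roughly the square root of 2 log n standard deviations above the mean

The largest observation grows very slowly, like the square root of a logarithm, as you get more data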

This slow growth means:

  • Going from 100 to 10,000 observations raises the expected maximum by only about 40% (from about 3.0 to 4.3 standard deviations under the $\sqrt{2 \ln n}$ approximation)
  • Extremes are effectively bounded and predictable
  • No single observation dominates the sum
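
The arithmetic behind the first bullet can be checked directly. The sketch below is a minimal illustration that assumes the leading-order approximation $\sqrt{2 \ln n}$ for a standard Gaussian; the function name `approx_gaussian_max` is hypothetical, and the approximation is known to overstate the true expected maximum at small sample sizes.

```python
import math

def approx_gaussian_max(n: int) -> float:
    """Leading-order approximation sqrt(2 ln n), in standard deviations."""
    return math.sqrt(2.0 * math.log(n))

sizes = [100, 10_000, 1_000_000]
approx = {n: approx_gaussian_max(n) for n in sizes}
for n in sizes:
    print(f"n = {n:>9,}: approx. max = {approx[n]:.2f} standard deviations")

# Squaring the sample size (100 -> 10,000) doubles ln n,
# so the approximation grows by a factor of sqrt(2).
growth_factor = approx[10_000] / approx[100]
print(f"growth factor from 100 to 10,000: {growth_factor:.3f}")
```

Even a hundredfold increase in data moves the approximate maximum by a little over one standard deviation.
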

Example

Heights in a Classroom

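Human heights are approximately Gaussian with mean 170 cm and standard deviation 10 cm. In a class of 30 students, the tallest is typically around 190 cm (2 standard deviations above the mean). In a school of 1,000 students, the tallest might be around 200 cm (about 3 standard deviations above the mean), only 10 cm more despite a population more than 30 times larger.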

You won't find someone 10 meters tall, no matter how large the population. The maximum is effectively bounded.
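
A quick simulation can reproduce these classroom figures. This is a sketch only: it assumes heights are exactly Gaussian with the stated mean and standard deviation, and the helper `mean_tallest`, the seed, and the trial counts are illustrative choices.

```python
import random
import statistics

def mean_tallest(group_size: int, trials: int, rng: random.Random) -> float:
    """Average, over many simulated groups, of the tallest height in cm."""
    tallest = [
        max(rng.gauss(170.0, 10.0) for _ in range(group_size))
        for _ in range(trials)
    ]
    return statistics.mean(tallest)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
class_tallest = mean_tallest(30, trials=2000, rng=rng)
school_tallest = mean_tallest(1000, trials=300, rng=rng)

print(f"class of 30:     tallest is about {class_tallest:.0f} cm")
print(f"school of 1,000: tallest is about {school_tallest:.0f} cm")
```
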

Extremes in Fat-Tailed Distributions

For fat-tailed distributions with power law tails, the maximum grows much faster:

Definition

Maximum Growth Rate

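For a Pareto distribution with tail exponent $\alpha$ and scale parameter $x_m$, the maximum $M_n$ of $n$ samples grows like a power of $n$:

$$M_n \sim x_m \, n^{1/\alpha}$$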

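The smaller the tail exponent $\alpha$ (small values are common for wealth distributions), the faster this growth:

  • With $n = 100$: max ~ $100^{1/\alpha}$ times the scale parameter
  • With $n = 10{,}000$: max ~ $10{,}000^{1/\alpha}$ times the scale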
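
To put numbers on this growth rate, the sketch below assumes a tail exponent of 1.5 purely for illustration (the value is an assumption, not a figure from the text). It compares the rule of thumb $n^{1/\alpha}$ with the exact median of the maximum of $n$ Pareto draws, which follows from solving $(1 - x^{-\alpha})^n = 1/2$ for a scale of 1. The function name `median_pareto_max` is hypothetical.

```python
def median_pareto_max(n: int, alpha: float, scale: float = 1.0) -> float:
    """Exact median of the maximum of n Pareto(alpha, scale) draws.

    Solves (1 - (scale / x) ** alpha) ** n = 0.5 for x.
    """
    return scale * (1.0 - 0.5 ** (1.0 / n)) ** (-1.0 / alpha)

ALPHA = 1.5  # assumed tail exponent, for illustration only

for n in (100, 10_000, 1_000_000):
    rule_of_thumb = n ** (1.0 / ALPHA)
    print(f"n = {n:>9,}: n^(1/alpha) = {rule_of_thumb:>8.0f}, "
          f"median max = {median_pareto_max(n, ALPHA):>8.0f}")

# A hundredfold increase in data multiplies the typical maximum
# by roughly 100 ** (1 / ALPHA), not by a factor close to 1.
ratio = median_pareto_max(10_000, ALPHA) / median_pareto_max(100, ALPHA)
print(f"growth factor from 100 to 10,000: {ratio:.1f}")
```

Under this assumption the typical maximum grows by a factor of about 20 between 100 and 10,000 observations, compared with a factor of about 1.4 in the Gaussian case.
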

Key Insight

The Dominance of Extremes

When $\alpha$ is small enough (most clearly when $\alpha \le 1$, where the mean is infinite), the maximum isn't just large: it dominates the entire sum. A single observation can account for more than half of the total.

This is not a bug; it's a feature of fat-tailed distributions. The extreme observations are where the action is.

How Extremes Dominate the Sum

Example

Gaussian vs. Pareto: Contribution of the Maximum

Consider 1,000 samples from each distribution:

Gaussian (thin-tailed):

  • Maximum ~ 3.3 standard deviations from mean
  • Contributes roughly 0.1% of the sum when the data are positive (for heights, about 200 cm out of roughly 170,000 cm)
  • Removing it barely affects the average

Pareto with a small tail exponent $\alpha$ (fat-tailed):

  • Maximum can be 10-100 times the median
  • Often contributes the majority of the sum
  • Removing it dramatically changes everything

This explains why the sample mean is unstable: it's dominated by the largest observation, which itself is highly variable.
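
The contrast can be sketched with a small simulation of the share of the total taken by the largest observation. Everything specific here is an assumption made for illustration: the Gaussian case reuses the heights parameters (so that the sum is positive), the two Pareto tail exponents (1.5 and 0.7) are arbitrary choices, and the helper names are hypothetical.

```python
import random
import statistics

def max_share(samples: list[float]) -> float:
    """Fraction of the total contributed by the single largest observation."""
    return max(samples) / sum(samples)

def mean_max_share(draw, n: int, trials: int) -> float:
    """Average max-to-sum share over repeated samples of size n."""
    return statistics.mean(
        max_share([draw() for _ in range(n)]) for _ in range(trials)
    )

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n, trials = 1000, 200

gaussian_share = mean_max_share(lambda: rng.gauss(170.0, 10.0), n, trials)
pareto_15_share = mean_max_share(lambda: rng.paretovariate(1.5), n, trials)
pareto_07_share = mean_max_share(lambda: rng.paretovariate(0.7), n, trials)

print(f"Gaussian heights:     {gaussian_share:.2%} of the sum")
print(f"Pareto, alpha = 1.5:  {pareto_15_share:.2%} of the sum")
print(f"Pareto, alpha = 0.7:  {pareto_07_share:.2%} of the sum")
```

The share rises sharply as the assumed tail exponent falls, which is why the choice of exponent matters so much for how dominant the maximum is.
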

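For $\alpha < 1$, the share of the sum $S_n = X_1 + \dots + X_n$ taken by the maximum $M_n$ does not shrink as the sample grows:

$$E\left[\frac{S_n}{M_n}\right] \to \frac{1}{1-\alpha} \quad \text{as } n \to \infty$$

Read: The ratio of the maximum to the sum stays of order one instead of shrinking to zero

The maximum remains a substantial fraction of the total, even with infinite data. For $\alpha > 1$ the ratio does go to zero, but only at the slow rate $n^{1/\alpha - 1}$.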

Real-World Implications

Example

Wealth Distribution

The upper tail of the wealth distribution is approximately Pareto with a small tail exponent $\alpha$. This means:

  • In a room of 100 random Americans, one person can easily hold a large share of the total wealth, sometimes more than half
  • "Average wealth" is a nearly meaningless statistic: it's dominated by the richest person in the sample
  • Adding Bill Gates to a room of 100 people instantly makes "average wealth" over $1 billion
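
The last bullet is simple arithmetic, sketched below. The dollar figures are hypothetical round numbers chosen only to show the effect; they are not estimates of anyone's actual net worth.

```python
import statistics

# Hypothetical round numbers, chosen only to illustrate the arithmetic.
typical_net_worth = 100_000                # each of 100 ordinary people, in dollars
billionaire_net_worth = 120_000_000_000    # one very wealthy newcomer, in dollars

room = [typical_net_worth] * 100
mean_before = statistics.mean(room)

room_with_newcomer = room + [billionaire_net_worth]
mean_after = statistics.mean(room_with_newcomer)
median_after = statistics.median(room_with_newcomer)
share = billionaire_net_worth / sum(room_with_newcomer)

print(f"mean before:   ${mean_before:,.0f}")
print(f"mean after:    ${mean_after:,.0f}")
print(f"median after:  ${median_after:,.0f}")
print(f"newcomer's share of total wealth: {share:.2%}")
```

The median does not move at all, which is one reason it is often preferred over the mean as a summary in fat-tailed settings.
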

Example

Insurance Losses

Hurricane losses follow fat-tailed distributions. For an insurance company:

  • Most years have modest claims
  • But one catastrophic event (Katrina, Harvey) can exceed all previous years combined
  • "Average annual loss" computed from historical data is deeply misleading

Key Insight

The Hidden Risk Problem

In fat-tailed domains, your sample may not contain any extreme observations — giving you false confidence about risk. Then one day, the extreme arrives and exceeds everything you've seen.

This is Taleb's "turkey problem": the turkey thinks life is great based on 1,000 days of being fed, then comes Thanksgiving. The extreme wasn't in the historical sample.
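
How badly history can understate future extremes can be sketched by comparing the largest value in one block of observations with the largest value in the next block of equal size. The setup is an assumption for illustration: 1,000 past and 1,000 future draws, Gaussian heights for the thin-tailed case, and a Pareto with an assumed tail exponent of 1.5 for the fat-tailed case. The helper names are hypothetical.

```python
import random

def future_to_past_max_ratios(draw, n: int, trials: int) -> list[float]:
    """Sorted ratios of the future block's maximum to the past block's maximum."""
    ratios = []
    for _ in range(trials):
        past_max = max(draw() for _ in range(n))
        future_max = max(draw() for _ in range(n))
        ratios.append(future_max / past_max)
    return sorted(ratios)

def percentile(sorted_values: list[float], q: float) -> float:
    """Simple empirical percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

rng = random.Random(2)  # fixed seed so the sketch is repeatable
n, trials = 1000, 200

thin = future_to_past_max_ratios(lambda: rng.gauss(170.0, 10.0), n, trials)
fat = future_to_past_max_ratios(lambda: rng.paretovariate(1.5), n, trials)

thin_p90 = percentile(thin, 0.9)
fat_p90 = percentile(fat, 0.9)
print(f"thin tails: in 1 case out of 10, future max exceeds past max by x{thin_p90:.2f}")
print(f"fat tails:  in 1 case out of 10, future max exceeds past max by x{fat_p90:.2f}")
```

In the thin-tailed case the worst future observation is barely larger than the worst past one; in the fat-tailed case it can be several times larger, so the historical maximum is a poor guide to what comes next.
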

The Mathematics of Dominance

Definition

Single Large Observation Dominates

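For a subexponential distribution (including all power laws), the tail of the sum $S_n = X_1 + \dots + X_n$ is asymptotically equivalent to the tail of the maximum $M_n$:

$$P(S_n > x) \sim P(M_n > x) \sim n \, P(X_1 > x) \quad \text{as } x \to \infty$$

This means: the probability of a large sum is essentially the probability that one observation is large.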

Compare this to thin tails, where the sum is driven by the accumulated effect of many moderate observations. The path to a large sum is fundamentally different.
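
The "one big jump" behaviour can be checked numerically for the simplest case of two observations. The sketch below assumes a Pareto distribution with scale 1 and a tail exponent of 1.5, a threshold of 50, and a cutoff of 90% of the threshold for calling a single term "large"; all of these are illustrative assumptions. At a finite threshold the ratio is expected to sit somewhat above 1 and to approach 1 only as the threshold grows.

```python
import random

ALPHA = 1.5        # assumed tail exponent, for illustration only
THRESHOLD = 50.0   # the "large" level x
PAIRS = 300_000    # number of simulated two-term sums

rng = random.Random(3)  # fixed seed so the sketch is repeatable
large_sums = 0
large_sums_with_one_big_term = 0
for _ in range(PAIRS):
    a, b = rng.paretovariate(ALPHA), rng.paretovariate(ALPHA)
    if a + b > THRESHOLD:
        large_sums += 1
        # Was the large sum produced mostly by a single term?
        if max(a, b) > 0.9 * THRESHOLD:
            large_sums_with_one_big_term += 1

p_sum = large_sums / PAIRS
p_single = THRESHOLD ** (-ALPHA)   # exact P(X > x) for a Pareto with scale 1
ratio = p_sum / (2 * p_single)     # subexponentiality predicts this tends to 1
one_big_fraction = large_sums_with_one_big_term / large_sums

print(f"P(S_2 > x) / (2 P(X > x)) = {ratio:.2f}")
print(f"large sums driven by a single large term: {one_big_fraction:.1%}")
```
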

Key Takeaways

  • In thin-tailed distributions, the maximum grows ~ $\sqrt{2 \ln n}$ and contributes little to the sum
  • In fat-tailed distributions, the maximum grows ~ $n^{1/\alpha}$ and can dominate the entire sum
  • A single observation can represent more than half the total in fat-tailed data
  • Historical data may hide extreme risk simply because the extreme hasn't occurred yet in your sample
  • This dominance by extremes is why standard statistics (which assume many small contributions) fail