The Gaussian Distribution

The famous bell curve — its properties, why it's so common, and why Taleb warns against overreliance on it.

The Gaussian distribution — also called the normal distribution or bell curve — is the most famous distribution in statistics. It appears everywhere, from measurement errors to heights to IQ scores. But as Taleb emphasizes, its very familiarity can be dangerous when applied to domains where it doesn't belong.

The Bell Curve Function

Before we look at the full Gaussian formula, let's understand the mathematical function that creates the famous “bell” shape: e−x².

Definition

The Bell Curve Shape

The Gaussian's shape comes from the function:

y = e−x²

This is the exponential function with a negative squared argument. The squaring makes it symmetric (same shape left and right), and the negative sign makes it decay away from x = 0.

Read: e to the minus x squared

The function that creates the bell shape — peaks at x = 0 and drops rapidly on both sides

What makes e−x² special is how fast it decays. The x² in the exponent means the decay rate accelerates as you move away from zero — this is what creates the Gaussian's “thin tails.”

Explore: The Bell Curve

See how e−x² compares to other decay functions. Notice how quickly it drops to zero — this is why extreme events are so rare under Gaussian assumptions.

The Bell Curve: y = e−x²

The Gaussian's shape comes from e−x² — the exponential of negative x squared. This creates the famous “bell” shape with its distinctive properties.


Notice the differences:

  • e−x² (black): Drops fastest — the “thin tails”
  • e−|x| (green): Drops slower — the boundary between thin/fat
  • 1/(1+x²) (red): Drops slowest — “fat tails” (Cauchy-like)

Why e−x² Creates Thin Tails

x | e−x² | Computed as
x = 1 | 0.3679 | e−1
x = 2 | 0.0183 | e−4
x = 3 | 0.0001 | e−9
x = 4 | 1.1e-7 | e−16

The x² in the exponent makes the decay accelerate as you move away from zero. At x = 3, you're computing e−9 ≈ 0.0001. At x = 4, it's e−16 ≈ 0.00000011.

Tail Comparison at x = 4

Function | Value at x = 4 | Tail Type
e−x² | 1.13e-7 | Extremely thin (Gaussian)
e−|x| | 1.83e-2 | Thin (Exponential)
1/(1+x²) | 0.0588 | Fat (Cauchy-like)

The Gaussian (e−x²) is 522,712× smaller than the Cauchy-like function at x = 4. This ratio grows rapidly as x increases.
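The numbers in the two tables above are easy to reproduce. Here is a minimal sketch (assuming Python with numpy, which the page itself doesn't require) that evaluates the three decay functions and the x = 4 ratio:

```python
# Sketch: reproduce the decay comparison e^(-x^2) vs e^(-|x|) vs 1/(1+x^2)
import numpy as np

xs = np.array([1.0, 2.0, 3.0, 4.0])

gauss_like  = np.exp(-xs**2)        # e^(-x^2): super-exponential decay
exp_like    = np.exp(-np.abs(xs))   # e^(-|x|): plain exponential decay
cauchy_like = 1.0 / (1.0 + xs**2)   # 1/(1+x^2): polynomial (fat-tailed) decay

for x, g, e, c in zip(xs, gauss_like, exp_like, cauchy_like):
    print(f"x={x:.0f}  e^-x^2={g:.2e}  e^-|x|={e:.2e}  1/(1+x^2)={c:.2e}")

# How much smaller the Gaussian-style tail is at x = 4
print("Cauchy-like / Gaussian-like at x=4:",
      round(cauchy_like[-1] / gauss_like[-1]))   # ~522,712
```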

The squared exponent is everything:

  • e−x decays exponentially (each unit of x divides by e ≈ 2.7)
  • e−x² decays super-exponentially (the rate itself accelerates)
  • This is why Gaussian tails are called “thin” — they vanish incredibly fast
  • 5-sigma events become “impossible” under Gaussian assumptions
Key Insight

Why This Creates Thin Tails

The squared exponent is everything. While e−x decays at a constant rate (dividing by e ≈ 2.7 for each unit of x), e−x² decays at an accelerating rate. At x = 4, you're computing e−16 — an incredibly small number. This is why 5-sigma events are “impossible” under Gaussian assumptions.

The Gaussian PDF

Definition

Gaussian Distribution

A random variable X follows a Gaussian (or normal) distribution with mean μ and standard deviation σ if its probability density function is:

f(x) = 1/(σ√(2π)) · exp(−(x − μ)² / (2σ²))

We write X ~ N(μ, σ²).

The Gaussian is completely determined by two parameters:

  • μ (mu) — the mean, median, and mode (all equal due to symmetry)
  • σ (sigma) — the standard deviation, controlling the spread
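As a quick numerical check of this definition, here is a small sketch of the density formula (assuming Python with numpy and scipy, neither of which the text requires) compared against scipy.stats.norm.pdf:

```python
# Sketch: the Gaussian PDF from the formula above, cross-checked against scipy
import numpy as np
from scipy.stats import norm

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """f(x) = 1/(sigma*sqrt(2*pi)) * exp(-(x - mu)**2 / (2*sigma**2))"""
    coef = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    return coef * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

x = np.linspace(-4, 4, 9)
mine = gaussian_pdf(x, mu=0.0, sigma=1.0)
ref  = norm.pdf(x, loc=0.0, scale=1.0)
print(np.allclose(mine, ref))   # True: the two agree
```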

Key Properties

All Moments Exist

Every moment of the Gaussian distribution is finite. The mean is μ, the variance is σ², the skewness is 0 (symmetric), and the kurtosis is exactly 3.
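A rough way to see these moments is to estimate them from simulated Gaussian samples. The sketch below assumes numpy and scipy; note that scipy reports excess kurtosis by default, so fisher=False is passed to recover the plain value of 3:

```python
# Sketch: estimate the first four moments from simulated Gaussian samples
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=1_000_000)

print("mean    ", samples.mean())                    # ~0  (mu)
print("variance", samples.var())                     # ~1  (sigma^2)
print("skewness", skew(samples))                     # ~0  (symmetric)
print("kurtosis", kurtosis(samples, fisher=False))   # ~3  (Gaussian)
```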

Stability Under Addition

If X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²) are independent, then:

X₁ + X₂ ~ N(μ₁ + μ₂, σ₁² + σ₂²)

The sum of Gaussians is Gaussian. This property (called “stability”) is rare and powerful — most distributions don't have it.
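A quick simulation (again assuming numpy; the particular means and standard deviations are arbitrary choices for illustration) shows the mean and variance of the sum behaving exactly as the formula predicts:

```python
# Sketch: stability under addition of two independent Gaussians
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(loc=1.0, scale=2.0, size=1_000_000)   # N(1, 2^2)
x2 = rng.normal(loc=3.0, scale=1.5, size=1_000_000)   # N(3, 1.5^2)
s = x1 + x2

print("mean of sum    ", s.mean())   # ~4.0  (1 + 3)
print("variance of sum", s.var())    # ~6.25 (4 + 2.25)
print("std of sum     ", s.std())    # ~2.5  (sqrt(6.25))
```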

Maximum Entropy

Among all distributions with a given mean and variance, the Gaussian has maximum entropy. It's the “least informative” distribution given those constraints — which is why it often appears when we know little about a system.
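One way to see this numerically is to compare the differential entropy of a few unit-variance distributions. The sketch below assumes scipy; the Laplace and uniform parameters are chosen only so that all three have variance 1:

```python
# Sketch: the Gaussian has the highest differential entropy at fixed variance
import numpy as np
from scipy.stats import norm, laplace, uniform

# All three distributions tuned to variance 1
gauss = norm(loc=0, scale=1)                          # var = 1
lap   = laplace(loc=0, scale=1/np.sqrt(2))            # var = 2*b^2 = 1
unif  = uniform(loc=-np.sqrt(3), scale=2*np.sqrt(3))  # var = (b-a)^2/12 = 1

print("Gaussian entropy:", gauss.entropy())   # ~1.419 (= 0.5*ln(2*pi*e))
print("Laplace  entropy:", lap.entropy())     # ~1.347
print("Uniform  entropy:", unif.entropy())    # ~1.242
```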

The 68-95-99.7 Rule

For a Gaussian distribution, nearly all probability mass is concentrated within a few standard deviations of the mean:

68.3% of values fall within ±1σ of μ

95.4% of values fall within ±2σ of μ

99.7% of values fall within ±3σ of μ

99.99994% of values fall within ±5σ of μ
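These percentages follow directly from the standard normal CDF. A minimal check, assuming scipy:

```python
# Sketch: the 68-95-99.7 (and 5-sigma) figures from the standard normal CDF
from scipy.stats import norm

for k in (1, 2, 3, 5):
    inside = norm.cdf(k) - norm.cdf(-k)   # P(|X - mu| <= k*sigma)
    print(f"within ±{k}σ: {inside:.5%}")
# ±1σ ≈ 68.3%, ±2σ ≈ 95.4%, ±3σ ≈ 99.7%, ±5σ ≈ 99.99994%
```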

Key Insight

The Rarity of Extreme Events

Under a Gaussian distribution, a “5-sigma event” should occur roughly once in 3.5 million observations. A 6-sigma event is 1 in a billion.

Taleb's critique: In financial markets, 5+ sigma events happen far more frequently than this. The 1987 crash was a 20+ sigma event under Gaussian assumptions — essentially impossible. This suggests markets are not Gaussian.

Explore: The Gaussian Distribution

Adjust the parameters to see how μ shifts the center and σ controls the spread. Watch how the 68-95-99.7 bands change as you modify σ.

[Interactive chart: adjust μ (mean) and σ (standard deviation) to see the center shift, the curve widen or narrow, and the ±1σ, ±2σ, ±3σ bands update]

Extreme Event Probabilities (beyond ±nσ)

P(|X − μ| > 4σ) = 6.33e-5 ≈ 1 in 15,800

P(|X − μ| > 5σ) = 5.73e-7 ≈ 1 in 1.7 million

P(|X − μ| > 6σ) = 1.97e-9 ≈ 1 in 507 million

Under Gaussian assumptions, 5σ events are vanishingly rare. In real markets, they happen regularly.

Tail Behavior: Light Tails

The survival function of the Gaussian (here standardized, with μ = 0 and σ = 1) decays extremely fast:

P(X > x) ≈ 1/(x√(2π)) · e−x²/2

Read: P of X greater than x is approximately 1 over x root 2 pi times e to the minus x squared over 2

The tail probability shrinks faster than any polynomial — exponentially fast in x²

This decay is extremely rapid. Compare:

  • At x = 4: P(X > 4) ≈ 3.2 × 10⁻⁵ (3 in 100,000)
  • At x = 5: P(X > 5) ≈ 2.9 × 10⁻⁷ (3 in 10 million)
  • At x = 6: P(X > 6) ≈ 9.9 × 10⁻¹⁰ (1 in a billion)
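Both the exact tail probabilities and the approximation above can be checked directly. A short sketch, assuming numpy and scipy:

```python
# Sketch: exact Gaussian survival probabilities vs the tail approximation
# P(X > x) ≈ e^(-x^2/2) / (x * sqrt(2*pi))
import numpy as np
from scipy.stats import norm

for x in (4, 5, 6):
    exact  = norm.sf(x)                                     # exact P(X > x)
    approx = np.exp(-x**2 / 2) / (x * np.sqrt(2 * np.pi))   # approximation
    print(f"x={x}:  exact={exact:.2e}  approx={approx:.2e}  1 in {1/exact:,.0f}")
```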

Visualizing Tail Decay

The log scale on the y-axis reveals how rapidly Gaussian tails vanish. The table shows the exact probability and “1 in N” interpretation for key σ thresholds.

Threshold | P(X > x) | Frequency
3σ | 1.35e-3 | 1 in 741
4σ | 3.17e-5 | 1 in 31,600
5σ | 2.87e-7 | 1 in 3.5 million
6σ | 9.9e-10 | 1 in 1.0 billion

Key insight: The Gaussian tail decays as e−x²/2, which is faster than exponential. Each additional σ reduces the tail probability by a large and growing factor (roughly 40× between 3σ and 4σ, several hundred× between 5σ and 6σ). This makes events beyond 5-6σ essentially “impossible” under Gaussian assumptions.

The problem: In fat-tailed domains like financial markets, events that “should” be 1-in-a-billion occur far more frequently. Using Gaussian models for such phenomena therefore leads to a catastrophic underestimation of tail risk.

Where the Gaussian Applies (and Doesn't)

Example

Good Applications

  • Measurement errors — instrumental noise often follows Gaussian
  • Heights of people — bounded by biology, symmetric around mean
  • IQ scores — by construction (they're normalized to be Gaussian)
  • Sums of many small effects — Central Limit Theorem applies (see the sketch below)
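The last point is easy to demonstrate: summing many small, independent effects produces something very close to a Gaussian. A minimal sketch, assuming numpy and scipy, with uniform “effects” chosen purely for illustration:

```python
# Sketch: Central Limit Theorem in action — sums of uniforms look Gaussian
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(42)

# 200,000 observations, each the sum of 50 independent uniform "small effects"
sums = rng.uniform(-1, 1, size=(200_000, 50)).sum(axis=1)

print("skewness:", skew(sums))                     # ~0 (symmetric)
print("kurtosis:", kurtosis(sums, fisher=False))   # ~3 (Gaussian-like)
```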
Example

Dangerous Misapplications

  • Financial returns — have fat tails, not Gaussian
  • Insurance claims — dominated by rare large events
  • Earthquake magnitudes — follow power laws
  • Wealth distribution — extremely skewed, unbounded
  • Pandemic sizes — fat-tailed with potential for extreme events
Key Insight

Taleb's Warning

The problem isn't that the Gaussian is wrong — it's that we apply it where it doesn't belong. In domains where single observations can dominate (wealth, returns, casualties), Gaussian assumptions lead to gross underestimation of tail risk.

Key Takeaways

  • The Gaussian has PDF f(x) = 1/(σ√(2π)) · exp(−(x − μ)²/(2σ²)), with tails that decay extremely fast
  • All moments exist; skewness = 0, kurtosis = 3
  • The 68-95-99.7 rule shows how concentrated the distribution is
  • Sums of Gaussians remain Gaussian (stability)
  • Extreme events are treated as “impossible” — a dangerous assumption in fat-tailed domains
  • Taleb's key insight: real-world extremes often vastly exceed Gaussian predictions