Convergence Types
Almost sure, in probability, and in distribution — and how they relate.
When we say "the sample mean converges to the true mean," what exactly do we mean by "converges"? There are several different notions of convergence in probability theory, each with different implications. Understanding these distinctions is crucial for knowing when the Law of Large Numbers applies.
Almost Sure Convergence
The strongest form of convergence: the sequence converges for almost every possible outcome.
Almost Sure Convergence
A sequence of random variables $X_1, X_2, \ldots$ converges almost surely (a.s.) to $X$ if:
$$P\left(\lim_{n \to \infty} X_n = X\right) = 1$$
We write $X_n \xrightarrow{\text{a.s.}} X$ or $X_n \to X$ almost surely.
Read: “The probability that X_n converges to X as n goes to infinity equals 1”
For (almost) every possible outcome of the random experiment, the sequence actually converges in the ordinary calculus sense
"Almost surely" means "with probability 1" — the set of outcomes where convergence fails has probability zero. It's like saying the sequence converges "except possibly on a negligible set."
Strong Law of Large Numbers
If $X_1, X_2, \ldots$ are i.i.d. with finite mean $\mu = E[X_i]$, then the sample mean converges almost surely:
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{\text{a.s.}} \mu$$
This is the Strong Law of Large Numbers — for almost every possible sequence of outcomes, the sample mean will converge to $\mu$.
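Here is a minimal simulation sketch of the Strong Law in action: the running sample mean of i.i.d. draws settles near the true mean. The choice of distribution (Exponential with mean 2), the seed, and the sample sizes are illustrative assumptions, not from the text.

```python
# Sketch of the Strong LLN: running sample means of i.i.d. draws settle near E[X].
import numpy as np

rng = np.random.default_rng(0)
true_mean = 2.0
x = rng.exponential(scale=true_mean, size=100_000)   # i.i.d. draws with E[X] = 2.0
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (100, 1_000, 10_000, 100_000):
    print(f"n={n:>7}: running sample mean = {running_mean[n - 1]:.4f}  (true mean = {true_mean})")
```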
When the Strong Law Fails
The Strong Law requires $E[|X_i|] < \infty$. For fat-tailed distributions where the mean is infinite (like Pareto with $\alpha \le 1$), the sample mean doesn't converge to any finite limit — it just keeps growing.
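A quick sketch of the failure mode, under the assumption of a Pareto tail index $\alpha = 0.8$ (any $\alpha \le 1$ would do): with an infinite mean, the running sample mean never settles.

```python
# Sketch of Strong LLN failure: Pareto with alpha = 0.8 has an infinite mean.
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.8                                    # tail index <= 1, so E[X] is infinite
x = rng.pareto(alpha, size=1_000_000) + 1.0    # Pareto(alpha) draws with minimum value 1
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"n={n:>9}: running sample mean = {running_mean[n - 1]:,.1f}")
# The printed values tend to keep growing (and jump after extreme draws) instead of converging.
```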
Convergence in Probability
A weaker notion: we only require that large deviations become increasingly unlikely.
Convergence in Probability
A sequence $X_n$ converges in probability to $X$ if for every $\epsilon > 0$:
$$\lim_{n \to \infty} P\left(|X_n - X| > \epsilon\right) = 0$$
We write $X_n \xrightarrow{P} X$.
This is weaker than almost sure convergence because it doesn't require the sequence to converge for any particular outcome — only that the probability of large deviations vanishes.
Weak Law of Large Numbers
The Weak Law of Large Numbers states that if $X_1, X_2, \ldots$ are i.i.d. with finite mean $\mu$, then:
$$\bar{X}_n \xrightarrow{P} \mu$$
The classical proof via Chebyshev's inequality assumes finite variance, but the result holds whenever the mean exists (Khinchin's theorem). Its conclusion is weaker than the Strong Law's: convergence in probability rather than almost sure convergence.
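A small Monte Carlo sketch of what convergence in probability looks like: for a fixed $\epsilon$, the estimated probability of a deviation larger than $\epsilon$ shrinks as $n$ grows. The Exponential distribution, $\epsilon = 0.05$, and the number of trials are illustrative choices.

```python
# Sketch of the Weak LLN: estimate P(|sample mean - mu| > eps) for growing n.
import numpy as np

rng = np.random.default_rng(2)
mu, eps, trials = 1.0, 0.05, 1_000

for n in (10, 100, 1_000, 10_000):
    sample_means = rng.exponential(scale=mu, size=(trials, n)).mean(axis=1)
    prob_dev = np.mean(np.abs(sample_means - mu) > eps)
    print(f"n={n:>6}: estimated P(|Xbar_n - mu| > {eps}) = {prob_dev:.3f}")
```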
Convergence in Distribution
The weakest form: only the distribution (CDF) converges, not the random variables themselves.
Convergence in Distribution
A sequence $X_n$ converges in distribution (or weakly) to $X$ if:
$$\lim_{n \to \infty} F_{X_n}(x) = F_X(x)$$
at every point $x$ where $F_X$ is continuous. We write $X_n \xrightarrow{d} X$.
Read: “The CDF of X_n converges to the CDF of X at all continuity points”
The probability distribution of X_n approaches the distribution of X
Importantly, convergence in distribution doesn't mean the random variables themselves are getting close — only their distributions are. The variables could even be defined on different probability spaces.
Central Limit Theorem
The Central Limit Theorem is a statement about convergence in distribution. If $X_1, X_2, \ldots$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$:
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$$
The standardized sample mean converges in distribution to a standard normal, regardless of the original distribution (as long as the variance exists).
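As an illustration, the sketch below compares the empirical distribution of the standardized sample mean to $N(0,1)$ using a Kolmogorov-Smirnov statistic. The Exponential(1) base distribution, seed, and sample sizes are arbitrary choices for the demo.

```python
# Sketch of the CLT as convergence in distribution: KS distance to N(0,1) shrinks with n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu = sigma = 1.0                                     # Exponential(1): mean 1, variance 1
trials = 10_000

for n in (2, 10, 100, 1_000):
    means = rng.exponential(scale=mu, size=(trials, n)).mean(axis=1)
    z = (means - mu) / (sigma / np.sqrt(n))          # standardized sample means
    ks = stats.kstest(z, "norm").statistic           # sup-distance to the N(0,1) CDF
    print(f"n={n:>5}: KS distance to N(0,1) = {ks:.3f}")
# The distance shrinks toward the Monte Carlo noise floor as n grows.
```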
CLT and Fat Tails
The Central Limit Theorem requires finite variance. For distributions with infinite variance (like Pareto with $\alpha < 2$), the CLT does not apply. The appropriately normalized sum converges to an $\alpha$-stable distribution instead — which may have heavy tails itself.
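The sketch below illustrates this under the assumption of a Pareto tail with $\alpha = 1.5$: the centered sum does not stabilize under the usual $\sqrt{n}$ scaling, but it does under the stable scaling $n^{1/\alpha}$. The tail index, trial count, and sample sizes are illustrative.

```python
# Sketch of CLT failure under infinite variance: compare sqrt(n) vs n^(1/alpha) scaling.
import numpy as np

def iqr(a):
    """Interquartile range: a spread measure that is robust to extreme draws."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

rng = np.random.default_rng(4)
alpha, trials = 1.5, 2_000
true_mean = alpha / (alpha - 1.0)                         # mean of Pareto(alpha), minimum 1

for n in (100, 1_000, 10_000):
    s = (rng.pareto(alpha, size=(trials, n)) + 1.0).sum(axis=1)
    clt_scaled = (s - n * true_mean) / np.sqrt(n)         # classical CLT scaling
    stable_scaled = (s - n * true_mean) / n ** (1 / alpha)  # stable-law scaling
    print(f"n={n:>6}: IQR under sqrt(n) scaling = {iqr(clt_scaled):7.2f}, "
          f"under n^(1/alpha) scaling = {iqr(stable_scaled):.2f}")
# Under sqrt(n) the spread keeps growing (no normal limit); under n^(1/alpha) it stabilizes.
```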
The Convergence Hierarchy
These convergence types form a hierarchy:
$$X_n \xrightarrow{\text{a.s.}} X \;\Longrightarrow\; X_n \xrightarrow{P} X \;\Longrightarrow\; X_n \xrightarrow{d} X$$
Read: “Almost sure convergence implies convergence in probability, which implies convergence in distribution”
Stronger convergence always implies weaker convergence, but not vice versa
| Type | Strength | What Converges | Key Theorem |
|---|---|---|---|
| Almost sure | Strongest | Individual sequences (a.e.) | Strong LLN |
| In probability | Medium | Probability of deviation | Weak LLN |
| In distribution | Weakest | CDFs / distributions | CLT |
Convergence That's Not Almost Sure
Let $X_1, X_2, \ldots$ be independent and define:
$$P(X_n = 1) = \frac{1}{n}, \qquad P(X_n = 0) = 1 - \frac{1}{n}$$
Then $X_n \xrightarrow{P} 0$ (since $P(|X_n| > \epsilon) = 1/n \to 0$), but it's not almost sure convergence — there's always some small probability of $X_n = 1$. In fact, because $\sum_n 1/n = \infty$ and the $X_n$ are independent, the second Borel–Cantelli lemma implies that $X_n = 1$ occurs infinitely often with probability 1, so no individual sequence settles at 0.
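A simulation sketch of this example (the horizon and block boundaries are arbitrary): the per-index probability of a 1 shrinks, yet along a single simulated path the 1s never stop arriving.

```python
# Sketch of X_n with P(X_n = 1) = 1/n: probabilities shrink, but 1s keep occurring.
import numpy as np

rng = np.random.default_rng(5)
N = 1_000_000
n = np.arange(1, N + 1)
path = rng.random(N) < 1.0 / n             # one realized path of X_1, ..., X_N (True means X_n = 1)

print(f"P(X_n = 1) at n = N: {1.0 / N:.6f}")
for lo, hi in ((0, 10), (10, 100), (100, 10_000), (10_000, 1_000_000)):
    hits = int(path[lo:hi].sum())          # number of indices lo < n <= hi with X_n = 1
    print(f"indices {lo + 1:>7}..{hi:<9}: X_n = 1 occurred {hits} times")
# On average about ln(10) ~ 2.3 hits per decade of n, so ones keep appearing arbitrarily far out.
```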
Why This Matters for Fat Tails
The type of convergence determines what guarantees we have about statistical estimates:
- Finite mean, finite variance: The strong LLN and CLT apply. Sample means converge a.s. to the true mean, and confidence intervals are valid.
- Finite mean, infinite variance: The strong LLN still applies (sample mean converges), but the CLT fails. Confidence intervals based on normal approximations are unreliable.
- Infinite mean: No LLN applies. Sample means don't converge to anything — they keep growing or fluctuating wildly.
The Practical Message
If you're working with data from a fat-tailed distribution with $\alpha < 2$, standard statistical tools that assume CLT convergence will give misleading results. The sample mean might look stable for a while, but a single extreme observation can dramatically change it — because the theoretical limit doesn't exist or isn't what the CLT predicts.
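To make this concrete, here is a small sketch comparing how much of a sample's total is due to its single largest observation, for a fat-tailed Pareto (the $\alpha = 1.1$ and $n$ are illustrative) versus a thin-tailed Exponential.

```python
# Sketch: share of the sample total contributed by the single largest observation.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
pareto = rng.pareto(1.1, size=n) + 1.0          # fat-tailed: finite mean, alpha barely above 1
expon = rng.exponential(scale=1.0, size=n)      # thin-tailed comparison

for name, x in (("Pareto(alpha=1.1)", pareto), ("Exponential(1)", expon)):
    share = x.max() / x.sum()                   # fraction of the total from the largest draw
    print(f"{name:<18}: largest draw is {100 * share:.3f}% of the total sum")
# In the fat-tailed sample one draw can carry a visible fraction of the sum,
# which is exactly why a single extreme observation can move the sample mean.
```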
Key Takeaways
- Almost sure convergence: $P(\lim_{n\to\infty} X_n = X) = 1$ — the sequence converges for almost every outcome
- Convergence in probability: $P(|X_n - X| > \epsilon) \to 0$ for every $\epsilon > 0$ — large deviations become unlikely
- Convergence in distribution: $F_{X_n}(x) \to F_X(x)$ at every continuity point of $F_X$ — only the CDFs converge
- Hierarchy: a.s. ⟹ probability ⟹ distribution
- The Strong LLN requires finite mean; the CLT requires finite variance
- For fat-tailed distributions with $\alpha < 2$ (infinite variance), the CLT fails — convergence is to a stable distribution, not a normal