Convergence Types
Almost sure, in probability, and in distribution — and how they relate.
When we say "the sample mean converges to the true mean," what exactly do we mean by "converges"? There are several different notions of convergence in probability theory, each with different implications. Understanding these distinctions is crucial for knowing when the Law of Large Numbers applies.
Almost Sure Convergence
The strongest form of convergence: the sequence converges for almost every possible outcome.
Almost Sure Convergence
A sequence of random variables $X_1, X_2, \ldots$ converges almost surely (a.s.) to $X$ if:
$$P\left(\lim_{n \to \infty} X_n = X\right) = 1$$
We write $X_n \xrightarrow{\text{a.s.}} X$ or $X_n \to X$ almost surely.
Read: “The probability that X_n converges to X as n goes to infinity equals 1”
For (almost) every possible outcome of the random experiment, the sequence actually converges in the ordinary calculus sense
"Almost surely" means "with probability 1" — the set of outcomes where convergence fails has probability zero. It's like saying the sequence converges "except possibly on a negligible set."
Strong Law of Large Numbers
If $X_1, X_2, \ldots$ are i.i.d. with finite mean $\mu = E[X_i]$, then the sample mean converges almost surely:
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{\text{a.s.}} \mu$$
This is the Strong Law of Large Numbers — for almost every possible sequence of outcomes, the sample mean will converge to $\mu$.
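Here is a minimal simulation sketch of the Strong Law in action: the running sample mean of i.i.d. draws settles near the true mean. The choice of distribution (Exponential with mean 2), the seed, and the sample sizes are illustrative assumptions, not from the text.

```python
# Sketch of the Strong LLN: running sample means of i.i.d. draws settle near E[X].
import numpy as np

rng = np.random.default_rng(0)
true_mean = 2.0
x = rng.exponential(scale=true_mean, size=100_000)   # i.i.d. draws with E[X] = 2.0
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (100, 1_000, 10_000, 100_000):
    print(f"n={n:>7}: running sample mean = {running_mean[n - 1]:.4f}  (true mean = {true_mean})")
```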
When the Strong Law Fails
The Strong Law requires $E[|X_i|] < \infty$. For fat-tailed distributions where the mean is infinite (like Pareto with $\alpha \le 1$), the sample mean doesn't converge to any finite limit — it just keeps growing.
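A quick sketch of the failure mode, under the assumption of a Pareto tail index $\alpha = 0.8$ (any $\alpha \le 1$ would do): with an infinite mean, the running sample mean never settles.

```python
# Sketch of Strong LLN failure: Pareto with alpha = 0.8 has an infinite mean.
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.8                                    # tail index <= 1, so E[X] is infinite
x = rng.pareto(alpha, size=1_000_000) + 1.0    # Pareto(alpha) draws with minimum value 1
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"n={n:>9}: running sample mean = {running_mean[n - 1]:,.1f}")
# The printed values tend to keep growing (and jump after extreme draws) instead of converging.
```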
Convergence in Probability
A weaker notion: we only require that large deviations become increasingly unlikely.
Convergence in Probability
A sequence $X_n$ converges in probability to $X$ if for every $\epsilon > 0$:
$$\lim_{n \to \infty} P\left(|X_n - X| > \epsilon\right) = 0$$
We write $X_n \xrightarrow{P} X$.
This is weaker than almost sure convergence because it doesn't require the sequence to converge for any particular outcome — only that the probability of large deviations vanishes.
Weak Law of Large Numbers
The Weak Law of Large Numbers states that if $X_1, X_2, \ldots$ are i.i.d. with finite mean $\mu$, then:
$$\bar{X}_n \xrightarrow{P} \mu$$
The classical proof via Chebyshev's inequality assumes finite variance, but the result holds whenever the mean exists (Khinchin's theorem). Its conclusion is weaker than the Strong Law's: convergence in probability rather than almost sure convergence.
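A small Monte Carlo sketch of what convergence in probability looks like: for a fixed $\epsilon$, the estimated probability of a deviation larger than $\epsilon$ shrinks as $n$ grows. The Exponential distribution, $\epsilon = 0.05$, and the number of trials are illustrative choices.

```python
# Sketch of the Weak LLN: estimate P(|sample mean - mu| > eps) for growing n.
import numpy as np

rng = np.random.default_rng(2)
mu, eps, trials = 1.0, 0.05, 1_000

for n in (10, 100, 1_000, 10_000):
    sample_means = rng.exponential(scale=mu, size=(trials, n)).mean(axis=1)
    prob_dev = np.mean(np.abs(sample_means - mu) > eps)
    print(f"n={n:>6}: estimated P(|Xbar_n - mu| > {eps}) = {prob_dev:.3f}")
```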
Convergence in Distribution
The weakest form: only the distribution (CDF) converges, not the random variables themselves.
Convergence in Distribution
A sequence $X_n$ converges in distribution (or weakly) to $X$ if:
$$\lim_{n \to \infty} F_{X_n}(x) = F_X(x)$$
at every point $x$ where $F_X$ is continuous. We write $X_n \xrightarrow{d} X$.
Read: “The CDF of X_n converges to the CDF of X at all continuity points”
The probability distribution of X_n approaches the distribution of X
Importantly, convergence in distribution doesn't mean the random variables themselves are getting close — only their distributions are. The variables could even be defined on different probability spaces.
Central Limit Theorem
The Central Limit Theorem is a statement about convergence in distribution. If $X_1, X_2, \ldots$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$:
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$$
The standardized sample mean converges in distribution to a standard normal, regardless of the original distribution (as long as the variance exists).
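As an illustration, the sketch below compares the empirical distribution of the standardized sample mean to $N(0,1)$ using a Kolmogorov-Smirnov statistic. The Exponential(1) base distribution, seed, and sample sizes are arbitrary choices for the demo.

```python
# Sketch of the CLT as convergence in distribution: KS distance to N(0,1) shrinks with n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu = sigma = 1.0                                     # Exponential(1): mean 1, variance 1
trials = 10_000

for n in (2, 10, 100, 1_000):
    means = rng.exponential(scale=mu, size=(trials, n)).mean(axis=1)
    z = (means - mu) / (sigma / np.sqrt(n))          # standardized sample means
    ks = stats.kstest(z, "norm").statistic           # sup-distance to the N(0,1) CDF
    print(f"n={n:>5}: KS distance to N(0,1) = {ks:.3f}")
# The distance shrinks toward the Monte Carlo noise floor as n grows.
```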
CLT and Fat Tails
The Central Limit Theorem requires finite variance. For distributions with infinite variance (like Pareto with $\alpha < 2$), the CLT does not apply. The appropriately normalized sum converges to an $\alpha$-stable distribution instead — which may have heavy tails itself.
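The sketch below illustrates this under the assumption of a Pareto tail with $\alpha = 1.5$: the centered sum does not stabilize under the usual $\sqrt{n}$ scaling, but it does under the stable scaling $n^{1/\alpha}$. The tail index, trial count, and sample sizes are illustrative.

```python
# Sketch of CLT failure under infinite variance: compare sqrt(n) vs n^(1/alpha) scaling.
import numpy as np

def iqr(a):
    """Interquartile range: a spread measure that is robust to extreme draws."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

rng = np.random.default_rng(4)
alpha, trials = 1.5, 2_000
true_mean = alpha / (alpha - 1.0)                         # mean of Pareto(alpha), minimum 1

for n in (100, 1_000, 10_000):
    s = (rng.pareto(alpha, size=(trials, n)) + 1.0).sum(axis=1)
    clt_scaled = (s - n * true_mean) / np.sqrt(n)         # classical CLT scaling
    stable_scaled = (s - n * true_mean) / n ** (1 / alpha)  # stable-law scaling
    print(f"n={n:>6}: IQR under sqrt(n) scaling = {iqr(clt_scaled):7.2f}, "
          f"under n^(1/alpha) scaling = {iqr(stable_scaled):.2f}")
# Under sqrt(n) the spread keeps growing (no normal limit); under n^(1/alpha) it stabilizes.
```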
The Convergence Hierarchy
These convergence types form a hierarchy:
$$X_n \xrightarrow{\text{a.s.}} X \;\Longrightarrow\; X_n \xrightarrow{P} X \;\Longrightarrow\; X_n \xrightarrow{d} X$$
Read: “Almost sure convergence implies convergence in probability, which implies convergence in distribution”
Stronger convergence always implies weaker convergence, but not vice versa
| Type | Strength | What Converges | Key Theorem |
|---|---|---|---|
| Almost sure | Strongest | Individual sequences (a.e.) | Strong LLN |
| In probability | Medium | Probability of deviation | Weak LLN |
| In distribution | Weakest | CDFs / distributions | CLT |
Convergence That's Not Almost Sure
Let $X_1, X_2, \ldots$ be independent and define:
$$P(X_n = 1) = \frac{1}{n}, \qquad P(X_n = 0) = 1 - \frac{1}{n}$$
Then $X_n \xrightarrow{P} 0$ (since $P(|X_n| > \epsilon) = 1/n \to 0$), but it's not almost sure convergence — there's always some small probability of $X_n = 1$. In fact, because $\sum_n 1/n = \infty$ and the $X_n$ are independent, the second Borel–Cantelli lemma implies that $X_n = 1$ occurs infinitely often with probability 1, so no individual sequence settles at 0.
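A simulation sketch of this example (the horizon and block boundaries are arbitrary): the per-index probability of a 1 shrinks, yet along a single simulated path the 1s never stop arriving.

```python
# Sketch of X_n with P(X_n = 1) = 1/n: probabilities shrink, but 1s keep occurring.
import numpy as np

rng = np.random.default_rng(5)
N = 1_000_000
n = np.arange(1, N + 1)
path = rng.random(N) < 1.0 / n             # one realized path of X_1, ..., X_N (True means X_n = 1)

print(f"P(X_n = 1) at n = N: {1.0 / N:.6f}")
for lo, hi in ((0, 10), (10, 100), (100, 10_000), (10_000, 1_000_000)):
    hits = int(path[lo:hi].sum())          # number of indices lo < n <= hi with X_n = 1
    print(f"indices {lo + 1:>7}..{hi:<9}: X_n = 1 occurred {hits} times")
# On average about ln(10) ~ 2.3 hits per decade of n, so ones keep appearing arbitrarily far out.
```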
Why This Matters for Fat Tails
The type of convergence determines what guarantees we have about statistical estimates:
- Finite mean, finite variance: The strong LLN and CLT apply. Sample means converge a.s. to the true mean, and confidence intervals are valid.
- Finite mean, infinite variance: The strong LLN still applies (sample mean converges), but the CLT fails. Confidence intervals based on normal approximations are unreliable.
- Infinite mean: No LLN applies. Sample means don't converge to anything — they keep growing or fluctuating wildly.
The Practical Message
If you're working with data from a fat-tailed distribution with $\alpha < 2$, standard statistical tools that assume CLT convergence will give misleading results. The sample mean might look stable for a while, but a single extreme observation can dramatically change it — because the theoretical limit doesn't exist or isn't what the CLT predicts.
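To make this concrete, here is a small sketch comparing how much of a sample's total is due to its single largest observation, for a fat-tailed Pareto (the $\alpha = 1.1$ and $n$ are illustrative) versus a thin-tailed Exponential.

```python
# Sketch: share of the sample total contributed by the single largest observation.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
pareto = rng.pareto(1.1, size=n) + 1.0          # fat-tailed: finite mean, alpha barely above 1
expon = rng.exponential(scale=1.0, size=n)      # thin-tailed comparison

for name, x in (("Pareto(alpha=1.1)", pareto), ("Exponential(1)", expon)):
    share = x.max() / x.sum()                   # fraction of the total from the largest draw
    print(f"{name:<18}: largest draw is {100 * share:.3f}% of the total sum")
# In the fat-tailed sample one draw can carry a visible fraction of the sum,
# which is exactly why a single extreme observation can move the sample mean.
```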
Key Takeaways
- Almost sure convergence: $P(\lim_{n\to\infty} X_n = X) = 1$ — the sequence converges for almost every outcome
- Convergence in probability: $P(|X_n - X| > \epsilon) \to 0$ for every $\epsilon > 0$ — large deviations become unlikely
- Convergence in distribution: $F_{X_n}(x) \to F_X(x)$ at every continuity point of $F_X$ — only the CDFs converge
- Hierarchy: a.s. ⟹ probability ⟹ distribution
- The Strong LLN requires finite mean; the CLT requires finite variance
- For fat-tailed distributions with $\alpha < 2$ (infinite variance), the CLT fails — convergence is to a stable distribution, not a normal