Problem Set 2: Fat Tail Behavior
Problems on sample mean instability and maximum dominance.
This problem set explores the distinctive behaviors that emerge under fat tails: sample mean instability and maximum dominance. These phenomena fundamentally challenge our intuitions from Gaussian statistics.
These problems are best explored computationally. Try implementing them in Python or R to see the effects firsthand.
Problem 4: Sample Mean Instability
Problem Statement
Generate 1000 samples from a Pareto distribution with tail index $\alpha$ in the infinite-variance regime ($\alpha \le 2$). Compute the sample mean. Repeat this process 100 times. What do you observe about the variability of the sample means?
Implementation Guide
- To generate Pareto samples: if $U \sim \text{Uniform}(0, 1)$, then $X = U^{-1/\alpha}$ is Pareto distributed with tail index $\alpha$ and minimum value 1
- Store the 100 sample means and examine their distribution
- Compare the spread of these means to what you would expect from Gaussian samples
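The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference solution: the tail index `alpha = 1.5`, the Gaussian baseline (mean 3, standard deviation 1), and the helper names `pareto_samples` and `sample_means` are all assumptions made for illustration.

```python
import numpy as np

def pareto_samples(rng, alpha, size):
    """Inverse-transform sampling: U**(-1/alpha) is Pareto(alpha) with minimum 1."""
    u = 1.0 - rng.random(size)  # uniform on (0, 1], so the power is always finite
    return u ** (-1.0 / alpha)

def sample_means(alpha=1.5, n=1000, repeats=100, seed=0):
    """Return `repeats` sample means, each computed from `n` Pareto draws."""
    rng = np.random.default_rng(seed)
    return pareto_samples(rng, alpha, (repeats, n)).mean(axis=1)

# alpha = 1.5 is an assumed value: finite mean (equal to 3), infinite variance.
pareto_means = sample_means()

# Assumed Gaussian baseline with the same mean and unit standard deviation.
rng = np.random.default_rng(1)
gauss_means = rng.normal(loc=3.0, scale=1.0, size=(100, 1000)).mean(axis=1)

for name, means in (("Pareto", pareto_means), ("Gaussian", gauss_means)):
    spread = means.max() - means.min()
    print(f"{name:<8} min={means.min():.3f}  max={means.max():.3f}  range={spread:.3f}")
```

Changing `seed` gives a different set of 100 means; comparing a few seeds is a quick way to see how much the Pareto range itself moves around.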
Questions to Consider
- Do the sample means cluster around a stable value?
- What is the range (max - min) of your 100 sample means?
- Does increasing n from 1000 to 10000 help stabilize the means?
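One way to probe the last question is to repeat the experiment at several sample sizes and compare the range of the 100 means. The sketch below assumes `alpha = 1.5`; the function name `mean_range` is hypothetical.

```python
import numpy as np

def mean_range(alpha, n, repeats=100, seed=0):
    """Smallest and largest of `repeats` sample means, each from `n` Pareto draws."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random((repeats, n))          # uniform on (0, 1]
    means = (u ** (-1.0 / alpha)).mean(axis=1)  # Pareto(alpha), minimum 1
    return float(means.min()), float(means.max())

# alpha = 1.5 is an assumed tail index, not a value given in the problem.
for n in (1_000, 10_000):
    lo, hi = mean_range(alpha=1.5, n=n)
    print(f"n={n:>6}  min={lo:.3f}  max={hi:.3f}  range={hi - lo:.3f}")
```

For a Gaussian, multiplying n by 10 would shrink the range by a factor of about 3.2 ($\sqrt{10}$); compare that with the shrinkage you actually observe here.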
Why Standard Errors Fail
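For $\alpha \le 2$, the variance of the Pareto distribution is infinite. The standard error formula $\sigma / \sqrt{n}$ is meaningless because $\sigma = \infty$. Increasing the sample size therefore does not deliver the stabilization the formula promises: the sample mean remains unstable at any practical sample size.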
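A short experiment makes the failure concrete: the sample standard deviation, which the standard error formula plugs in for the true one, has no finite value to converge to, so its typical size keeps growing with n. The sketch below assumes `alpha = 1.5` and uses the median over 200 repetitions as a summary; both choices, and the name `median_sample_std`, are illustrative.

```python
import numpy as np

def median_sample_std(alpha, n, repeats=200, seed=0):
    """Median, over `repeats` independent samples of size n, of the sample std."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random((repeats, n))  # uniform on (0, 1]
    x = u ** (-1.0 / alpha)             # Pareto(alpha), minimum 1
    return float(np.median(x.std(axis=1, ddof=1)))

# alpha = 1.5 is an assumed tail index with infinite variance.
for n in (100, 1_000, 10_000):
    s = median_sample_std(alpha=1.5, n=n)
    print(f"n={n:>6}  median sample std={s:8.2f}  naive SE={s / np.sqrt(n):.4f}")
```

The "naive SE" column is what a Gaussian-minded analyst would report; it is computed from a quantity that never settles down.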
What You Should Observe
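The sample means will vary wildly: some runs produce moderate values, while others are dominated by a single extreme observation. This is not a bug; it is a feature of fat tails. When the mean is finite but the variance is infinite ($1 < \alpha \le 2$), the Law of Large Numbers still holds, but convergence is far slower than the Gaussian $1/\sqrt{n}$ rate. When $\alpha \le 1$, the mean itself is infinite and the sample mean does not converge at all.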
Problem 5: Maximum Dominance
Problem Statement
For $n = 100$ samples from a Pareto distribution with tail index $\alpha$, what fraction of the sum comes from the largest observation on average?
Implementation Guide
- Generate 100 Pareto samples
- Compute the sum $S = \sum_{i=1}^{n} X_i$
- Find the maximum $M = \max_i X_i$
- Calculate the ratio $M / S$
- Repeat many times and compute the average ratio
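The procedure above might look like this in Python. It is a sketch under stated assumptions: `alpha = 1.5` and 10,000 repetitions are illustrative choices, and `max_to_sum_ratio` is a hypothetical helper name.

```python
import numpy as np

def max_to_sum_ratio(alpha=1.5, n=100, trials=10_000, seed=0):
    """For each of `trials` samples of size n, return max(sample) / sum(sample)."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random((trials, n))  # uniform on (0, 1]
    x = u ** (-1.0 / alpha)            # Pareto(alpha), minimum 1
    return x.max(axis=1) / x.sum(axis=1)

# alpha = 1.5 is an assumed tail index; try smaller values to fatten the tail.
ratios = max_to_sum_ratio()
print(f"mean ratio   : {ratios.mean():.3f}")
print(f"median ratio : {np.median(ratios):.3f}")
print(f"share of runs where the maximum exceeds half the sum: {(ratios > 0.5).mean():.4f}")
```

Looking at the whole distribution of ratios, not just the average, shows how often a single observation takes over a run.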
Maximum Dominance — A Fat Tail Signature
In the Gaussian world, the largest observation contributes on the order of $1/n$ of the sum for a sample of size $n$ (about 1% for $n = 100$). In fat-tailed distributions, the largest observation can contribute 30-60% or more when the tail is heavy enough! This is the defining feature of subexponential distributions.
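To see the contrast side by side, the sketch below computes the average share of the maximum for a Gaussian baseline and for Pareto samples with several tail indices. Everything specific here is an assumption for illustration: the Gaussian parameters (mean 3, standard deviation 1, chosen so that sums stay positive), the list of `alpha` values, and the helper names.

```python
import numpy as np

def mean_max_share(draw, n=100, trials=5_000, seed=0):
    """Average of max/sum over `trials` samples; `draw(rng, shape)` supplies the data."""
    rng = np.random.default_rng(seed)
    x = draw(rng, (trials, n))
    return float(np.mean(x.max(axis=1) / x.sum(axis=1)))

def pareto(alpha):
    # Inverse-transform sampler for Pareto(alpha) with minimum value 1.
    return lambda rng, shape: (1.0 - rng.random(shape)) ** (-1.0 / alpha)

def gaussian(rng, shape):
    # Hypothetical baseline: mean 3, standard deviation 1.
    return rng.normal(3.0, 1.0, shape)

shares = {"Gaussian(3, 1)": mean_max_share(gaussian)}
for alpha in (2.5, 1.5, 1.0, 0.5):
    shares[f"Pareto(alpha={alpha})"] = mean_max_share(pareto(alpha))

for name, share in shares.items():
    print(f"{name:<20} mean max/sum = {share:.3f}")
```

The share of the maximum grows as `alpha` shrinks, so the size of the effect you measure depends strongly on the tail index you pick.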
Theoretical Background
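For subexponential distributions (including Pareto), there is a remarkable result. For independent, identically distributed $X_1, \ldots, X_n$:

$$P(X_1 + \cdots + X_n > x) \sim P(\max(X_1, \ldots, X_n) > x) \quad \text{as } x \to \infty$$

where $\sim$ means that the ratio of the two sides tends to 1.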
This means the probability of a large sum is dominated by the probability of a single large observation. The sum is large because one term is large, not because many moderate terms accumulated.
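This claim can be checked by Monte Carlo: estimate the probability that the sum exceeds a threshold x and the probability that the maximum exceeds x, then watch their ratio approach 1 as x grows. The sketch assumes `alpha = 1.5`, only `n = 2` terms, and one million trials so that the tail events are frequent enough to count; `tail_ratio` is a hypothetical name.

```python
import numpy as np

def tail_ratio(x, alpha=1.5, n=2, trials=1_000_000, seed=0):
    """Monte Carlo estimate of P(sum > x) / P(max > x) for n Pareto(alpha) terms."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random((trials, n))  # uniform on (0, 1]
    samples = u ** (-1.0 / alpha)      # Pareto(alpha), minimum 1
    p_sum = np.mean(samples.sum(axis=1) > x)
    p_max = np.mean(samples.max(axis=1) > x)
    return float(p_sum / p_max)

# Thresholds are illustrative; very large x needs more trials for a stable estimate.
for x in (5, 20, 100):
    print(f"x={x:>4}  P(sum > x) / P(max > x) = {tail_ratio(x):.3f}")
```

Because every term is at least 1, the maximum exceeding x forces the sum to exceed x as well, so the estimated ratio can never fall below 1; it approaches 1 from above.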
Real-World Implications
This explains why single events can dominate entire portfolios, insurance pools, or historical records. In wealth distribution, a single billionaire can have more than the bottom 50% combined. In market crashes, a single day can account for most of a decade's losses.
What You Should Learn
- Sample means stabilize very slowly, or not at all, for distributions with infinite variance
- Standard error formulas are meaningless when variance is infinite
- In fat-tailed distributions, the maximum dominates the sum
- These are not bugs or anomalies — they are fundamental mathematical properties
- Understanding these behaviors is essential for risk management in Extremistan