Mathematical Notation Glossary
A reference guide for symbols and notation used throughout the course. New to mathematical notation? Start with the Greek letters and common patterns.
mu (“mew”) [greek]
The Greek letter mu, commonly used to represent the population mean or expected value of a distribution.
Examples: μ = E[X]; X ~ N(μ, σ²)

sigma (“sig-ma”) [greek]
The Greek letter sigma, used for standard deviation (σ) or variance (σ²). Measures the spread of a distribution around its mean.
Examples: σ = √Var(X); σ² = E[(X - μ)²]

capital sigma (“sig-ma”) [greek]
Capital sigma indicates summation - adding up a series of terms. The subscript and superscript specify the range.
Example: Σᵢ₌₁ⁿ xᵢ means x₁ + x₂ + ... + xₙ

alpha (“al-fa”) [greek]
The Greek letter alpha, often used for the tail exponent in power law distributions, or as a general parameter.
Examples: P(X > x) ~ x^(-α); α-stable distribution

beta (“bay-ta”) [greek]
The Greek letter beta, commonly used as a parameter in distributions or regression coefficients.
Example: Beta(α, β) distribution

gamma (“gam-a”) [greek]
The Greek letter gamma, often used to represent skewness (third standardized moment) or as a parameter.
Example: γ = E[(X-μ)³]/σ³

kappa (“kap-a”) [greek]
The Greek letter kappa, sometimes used for kurtosis (fourth standardized moment).
Example: κ = E[(X-μ)⁴]/σ⁴

capital omega (“oh-may-ga”) [greek]
Capital omega represents the sample space - the set of all possible outcomes of a random experiment.
Examples: ω ∈ Ω; X: Ω → ℝ

delta (“del-ta”) [greek]
The Greek letter delta, commonly used to represent a small change or difference in a quantity.
Examples: δx = x₂ - x₁; δ-function

capital gamma (“gam-a”) [greek]
Capital gamma denotes the gamma function Γ(n), which extends factorial to non-integers: Γ(n) = (n-1)! for positive integers.
Examples: Γ(n+1) = n!; Γ(1/2) = √π
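
As a quick numerical check of the factorial identity in the entry above, here is a minimal Python sketch; it uses only the standard library, and the range of n is an arbitrary illustration choice:

```python
import math

# Gamma extends the factorial: Gamma(n + 1) = n! for non-negative integers n.
for n in range(1, 6):
    print(n, math.gamma(n + 1), math.factorial(n))  # the two columns should match

# Gamma(1/2) = sqrt(pi)
print(math.gamma(0.5), math.sqrt(math.pi))
```
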
xi (“zy or ksee”) [greek]
The Greek letter xi, commonly used as the shape parameter in Extreme Value Theory (GEV distribution). Determines tail behavior.
Examples: ξ > 0: Fréchet (fat tails); ξ = 0: Gumbel; ξ < 0: Weibull

nu (“new”) [greek]
The Greek letter nu, commonly used for degrees of freedom in t-distributions and chi-squared distributions.
Examples: t(ν) distribution; χ²(ν) distribution

lambda (“lam-da”) [greek]
The Greek letter lambda, often used as a rate parameter in Poisson or exponential distributions.
Examples: Poisson(λ); Exponential(λ)

pi (“pie”) [greek]
The mathematical constant pi, the ratio of a circle's circumference to its diameter. Appears frequently in probability density functions.
Example: Normal PDF: (1/√(2π))e^(-x²/2)

infinity (“infinity”) [general]
The infinity symbol represents an unbounded quantity. Often appears in integration limits.
Examples: ∫₋∞^∞ f(x)dx = 1; lim_{n→∞}

element of [set]
Indicates membership in a set. "x ∈ A" means "x is an element of set A" or "x is in A".
Example: x ∈ ℝ means x is a real number

for all [set]
Universal quantifier. "∀x" means "for all x" or "for every x".
Example: ∀x > 0, P(X > x) ≥ 0

there exists [set]
Existential quantifier. "∃x" means "there exists an x" or "for some x".
Example: ∃x such that f(x) = 0

empty set [set]
The empty set, a set containing no elements. Also written as {}.
Example: P(∅) = 0

subset [set]
Indicates that one set is contained within another. "A ⊂ B" means "A is a subset of B".
Example: ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ

union [set]
Set union. "A ∪ B" is the set of elements in A or B (or both).
Example: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

intersection [set]
Set intersection. "A ∩ B" is the set of elements in both A and B.
Example: P(A ∩ B) = P(A)P(B) if independent

real numbers [set]
The set of all real numbers, including rational and irrational numbers. Continuous random variables typically take values in ℝ or subsets of ℝ.
Example: X: Ω → ℝ

integral [calculus]
The integral sign indicates integration - a continuous analog of summation. Used extensively for computing probabilities and expectations.
Examples: ∫₀¹ f(x)dx; E[X] = ∫xf(x)dx

limit [calculus]
The limit of a function or sequence. "lim_{x→a} f(x)" is the value f(x) approaches as x gets closer to a.
Example: lim_{n→∞} (1 + 1/n)ⁿ = e
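
To watch the limit example converge, here is a small sketch using only the Python standard library; the particular values of n are arbitrary:

```python
import math

# (1 + 1/n)^n approaches e as n grows without bound
for n in (10, 1_000, 100_000, 10_000_000):
    print(n, (1 + 1 / n) ** n)

print("e =", math.e)
```
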
partial [calculus]
Indicates a partial derivative - the derivative with respect to one variable while holding others constant.
Example: ∂f/∂x

probability [probability]
The probability function. P(A) gives the probability of event A occurring.
Examples: P(X > 0); P(A | B)

expectation [probability]
The expectation operator. E[X] is the expected value (mean) of random variable X.
Examples: E[X] = ∫xf(x)dx; E[X²]

variance [probability]
The variance of a random variable, measuring its spread around the mean. Var(X) = E[(X - μ)²].
Example: Var(X) = E[X²] - (E[X])²
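
A minimal simulation sketch of the variance identity above, assuming NumPy is available; the seed, sample size, and the N(2, 9) distribution are arbitrary illustration choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1_000_000)  # X ~ N(2, 9)

# Var(X) = E[X^2] - (E[X])^2, estimated from the sample
print(np.mean(x**2) - np.mean(x)**2)  # ~ 9
print(np.var(x))                      # same quantity, computed directly
```
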
distributed as [probability]
Indicates the distribution of a random variable. "X ~ N(0,1)" means "X is distributed as standard normal".
Examples: X ~ N(μ, σ²); X ~ Exp(λ)

PDF [probability]
The probability density function (for continuous distributions). Describes the relative likelihood of different values.
Example: f(x) = (1/√(2π))e^(-x²/2)

CDF [probability]
The cumulative distribution function. F(x) = P(X ≤ x) gives the probability of being at or below x.
Example: F(x) = ∫₋∞ˣ f(t)dt

Survival function [probability]
The survival function (also called tail function). S(x) = P(X > x) = 1 - F(x) gives the probability of exceeding x.
Examples: S(x) = 1 - F(x); S(x) = P(X > x)
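
As a numerical illustration of F(x) and S(x) = 1 - F(x), a short sketch assuming SciPy is installed; the standard normal and the threshold 1.96 are arbitrary choices:

```python
from scipy.stats import norm

x = 1.96
print(norm.cdf(x))      # F(x) = P(X <= x), about 0.975 for the standard normal
print(norm.sf(x))       # S(x) = P(X > x), the survival function
print(1 - norm.cdf(x))  # same as the survival function, by definition
```
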
covariance [probability]
The covariance measures the joint variability of two random variables. Cov(X,Y) = E[(X - μ_X)(Y - μ_Y)].
Examples: Cov(X,Y) = E[XY] - E[X]E[Y]; Cov(X,X) = Var(X)
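
A quick simulation sketch of the covariance identities, assuming NumPy; the linear relationship between x and y is an arbitrary choice that makes the covariance nonzero:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500_000)
y = 0.5 * x + rng.normal(size=500_000)

# Cov(X, Y) = E[XY] - E[X]E[Y]
print(np.mean(x * y) - np.mean(x) * np.mean(y))  # ~ 0.5
# Cov(X, X) = Var(X)
print(np.cov(x, x)[0, 1], np.var(x, ddof=1))     # both ~ 1
```
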
i.i.d. [probability]
Independent and identically distributed: random variables that are mutually independent and share the same probability distribution.
Example: X₁, X₂, ..., Xₙ i.i.d. ~ N(0,1)

almost surely [probability]
Almost surely (or almost certain) means an event happens with probability 1. Used in convergence theorems.
Examples: Xₙ → X a.s.; P(event) = 1

independence [probability]
The independence symbol indicates that random variables or events are statistically independent.
Examples: X ⊥ Y; A ⊥ B means P(A∩B) = P(A)P(B)

asymptotic equivalence (“is asymptotically equivalent to”) [calculus]
f(x) ~ g(x) means lim f(x)/g(x) = 1 as x → ∞. The functions become indistinguishable in ratio for large x. Used extensively to describe tail behavior.
Examples: P(X > x) ~ x^{-α}; n! ~ √(2πn)(n/e)^n (Stirling)
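
To see asymptotic equivalence in action, a small sketch checking Stirling's formula, using only the standard library; it works in log space so large n does not overflow, and the chosen values of n are arbitrary:

```python
import math

# n! ~ sqrt(2*pi*n) * (n/e)^n : the ratio should approach 1 as n grows.
for n in (5, 50, 500, 5000):
    log_factorial = math.lgamma(n + 1)  # ln(n!)
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    print(n, math.exp(log_factorial - log_stirling))  # ratio, heading toward 1
```
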
big-O (“big-O of”) [calculus]
f(x) = O(g(x)) means |f(x)| ≤ C|g(x)| for some constant C and sufficiently large x. It provides an upper bound on the growth rate.
Examples: x² + x = O(x²); sin(x) = O(1)

little-o (“little-o of”) [calculus]
f(x) = o(g(x)) means f(x)/g(x) → 0 as x → ∞. The function f becomes negligible compared to g. Stronger than big-O.
Examples: x = o(x²); ln(x) = o(x^ε) for any ε > 0
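
A tiny numerical sketch of the little-o example ln(x) = o(x^ε), here with ε = 0.1 picked arbitrarily (standard library only):

```python
import math

# ln(x) / x^0.1 shrinks toward 0 as x grows, even though x^0.1 grows very slowly
for x in (1e6, 1e12, 1e24, 1e48):
    print(x, math.log(x) / x ** 0.1)
```
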
slowly varying function [probability]
A function L(x) is slowly varying if L(tx)/L(x) → 1 as x → ∞ for all t > 0. Examples include constants, ln(x), and (ln x)^k. Used in fat tail theory to capture deviations from pure power laws.
Examples: S(x) = x^{-α}L(x); ln(x) is slowly varying

regularly varying [probability]
A function f(x) is regularly varying with index α if f(tx)/f(x) → t^α as x → ∞ for all t > 0. Can be written as f(x) = x^α L(x) where L is slowly varying. Characterizes fat-tailed distributions.
Examples: S(x) = x^{-2} ln(x) is RV_{-2}; Pareto survival is RV_{-α}
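
A short numerical sketch of both definitions, using L(x) = ln(x) and S(x) = x^{-2} ln(x) from the examples above; standard library only, and t = 2 is an arbitrary choice:

```python
import math

def L(x):
    return math.log(x)            # slowly varying: ln(x)

def S(x):
    return x ** -2 * math.log(x)  # regularly varying with index -2

t = 2.0
for x in (1e2, 1e5, 1e10, 1e20):
    # first ratio -> 1, second ratio -> t^(-2) = 0.25
    print(x, L(t * x) / L(x), S(t * x) / S(x))
```
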
domain of attraction [probability]
A distribution F is in the domain of attraction of G (written F ∈ D(G)) if the maximum of n i.i.d. samples from F, properly normalized, converges to G. Determines which extreme value type applies.
Examples: Pareto ∈ D(Fréchet); Gaussian ∈ D(Gumbel)
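
As an illustration of a domain-of-attraction statement, here is a small simulation sketch assuming NumPy. It uses the exponential distribution, which (like the Gaussian) lies in the Gumbel domain; the seed, block size, and number of trials are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 1_000, 5_000

# Maxima of n Exponential(1) samples, centered by ln(n), should look Gumbel.
maxima = rng.exponential(size=(trials, n)).max(axis=1) - np.log(n)

# Compare the empirical CDF of the maxima with the standard Gumbel CDF exp(-exp(-x)).
for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, np.mean(maxima <= x), np.exp(-np.exp(-x)))
```
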
Common Notation Patterns

These patterns show how to read common mathematical expressions. Practice reading them aloud to build fluency.
Probability Notation
P(X > x)
Read: “The probability that X is greater than x”
What fraction of the time does X exceed the value x?

P(a ≤ X ≤ b)
Read: “The probability that X is between a and b”
What fraction of the time is X in the range from a to b?

P(A | B)
Read: “The probability of A given B”
If we know B happened, what's the probability of A?

E[X]
Read: “The expected value of X”
The average value of X over many observations

E[g(X)]
Read: “The expected value of g of X”
The average value of the function g applied to X

Var(X)
Read: “The variance of X”
How spread out is X around its mean?

X ~ N(μ, σ²)
Read: “X is normally distributed with mean μ and variance σ²”
X follows the famous bell curve centered at μ

X ~ Exponential(λ)
Read: “X is exponentially distributed with rate λ”
X follows the exponential distribution, modeling waiting times

X ~ Pareto(α, xₘ)
Read: “X is Pareto distributed with tail exponent α and scale xₘ”
X follows a power law distribution with heavy tails

P(A ∩ B)
Read: “The probability of A and B both occurring”
Probability that both events A and B happen

P(A ∪ B)
Read: “The probability of A or B occurring”
Probability that at least one of A or B happens

Cov(X,Y)
Read: “The covariance of X and Y”
How much X and Y vary together

F ∈ D(G)
Read: “F is in the domain of attraction of G”
Maxima from F converge to the extreme value distribution G

S(x) = x^{-α}L(x)
Read: “S of x equals x to the negative alpha times L of x”
The survival function is a power law with slowly varying correction L(x)
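
To connect a couple of these patterns to numbers, a brief sketch assuming SciPy; the standard normal and the interval [-1, 1] are arbitrary choices:

```python
from scipy.stats import norm

# P(a <= X <= b) = F(b) - F(a) for X ~ N(0, 1)
a, b = -1.0, 1.0
print(norm.cdf(b) - norm.cdf(a))  # ~ 0.683, the familiar "one sigma" probability

# P(X > x), read "the probability that X is greater than x"
print(norm.sf(1.96))              # ~ 0.025
```
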
Function Notation
f(x)
Read: “f of x”
The output of function f when given input x

f: A → B
Read: “f maps from A to B”
Function f takes inputs from set A and produces outputs in set B

lim_{x→∞} f(x)
Read: “The limit of f(x) as x approaches infinity”
What value does f(x) approach as x gets arbitrarily large?

lim_{x→a} f(x)
Read: “The limit of f(x) as x approaches a”
What value does f(x) approach as x gets closer and closer to a?

n → ∞
Read: “n approaches infinity”
n grows without bound

f(x) ~ g(x)
Read: “f is asymptotically equivalent to g”
The ratio f(x)/g(x) approaches 1 as x → ∞

f(x) = O(g(x))
Read: “f is big-O of g”
f is bounded above by a constant times g for large x

f(x) = o(g(x))
Read: “f is little-o of g”
f becomes negligible compared to g as x → ∞
Integration
∫ f(x) dx
Read: “The integral of f(x) with respect to x”
Add up all the infinitesimally small pieces of f(x)

∫ₐᵇ f(x) dx
Read: “The integral of f(x) from a to b”
Add up f(x) values for all x between a and b

∫₋∞^∞ f(x) dx
Read: “The integral of f(x) from negative infinity to positive infinity”
Add up f(x) over all possible values of x
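
To make these integrals concrete, a short numerical sketch assuming SciPy; the standard normal density is an arbitrary example integrand:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Total probability: the standard normal density integrates to 1 over the whole line.
total, _ = quad(norm.pdf, -np.inf, np.inf)
print(total)   # ~ 1.0

# A moment as an integral: E[X^2] = integral of x^2 * f(x) dx, here ~ 1 for N(0, 1).
second, _ = quad(lambda x: x**2 * norm.pdf(x), -np.inf, np.inf)
print(second)  # ~ 1.0
```
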
Set Notation
{x | P(x)}
Read: “The set of all x such that P(x) holds”
All values x that satisfy the condition P(x)

x ∈ A
Read: “x is an element of A”
x is one of the things in set A
Quick Reference: Most Common Symbols
μ         Mean
σ         Std deviation
Σ         Summation
α         Tail exponent
∫         Integration
E[X]      Expected value
P(A)      Probability
Var(X)    Variance
~         Distributed as