Mathematical Notation Glossary

A reference guide for symbols and notation used throughout the course. New to mathematical notation? Start with the Greek letters and common patterns.


μ

mu

(“mew”) greek

The Greek letter mu, commonly used to represent the population mean or expected value of a distribution.

μ = E[X]; X ~ N(μ, σ²)
σ

sigma

(“sig-ma”) greek

The Greek letter sigma, used for standard deviation (σ) or variance (σ²). Measures the spread of a distribution around its mean.

σ = √Var(X); σ² = E[(X - μ)²]
Σ

capital sigma

(“sig-ma”) greek

Capital sigma indicates summation: adding up a series of terms. The subscript and superscript specify the range of the index.

Σᵢ₌₁ⁿ xᵢ means x₁ + x₂ + ... + xₙ
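As a minimal sketch of how the summation reads in code, the snippet below writes Σᵢ₌₁ⁿ xᵢ as an explicit loop and compares it with Python's built-in sum(). The sample values in xs are made up for illustration.

```python
# Hypothetical sample values x_1, ..., x_n (here n = 4).
xs = [2.0, 3.5, 1.5, 4.0]
n = len(xs)

# Σᵢ₌₁ⁿ xᵢ written as an explicit loop over the index i = 1, ..., n.
total = 0.0
for i in range(1, n + 1):
    total += xs[i - 1]  # Python lists are 0-indexed, so x_i is xs[i - 1]

# The built-in sum() computes the same quantity.
print(total, sum(xs))
```

The subscript (i = 1) and superscript (n) of the sigma correspond to the start and end of the loop range.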
α

alpha

(“al-fa”) greek

The Greek letter alpha, often used for the tail exponent in power law distributions, or as a general parameter.

P(X > x) ~ x^(-α); α-stable distribution
β

beta

(“bay-ta”) greek

The Greek letter beta, commonly used as a parameter in distributions or regression coefficients.

Beta(α, β) distribution
γ

gamma

(“gam-a”) greek

The Greek letter gamma, often used to represent skewness (third standardized moment) or as a parameter.

γ = E[(X-μ)³]/σ³
κ

kappa

(“kap-a”) greek

The Greek letter kappa, sometimes used for kurtosis (fourth standardized moment).

κ = E[(X-μ)⁴]/σ⁴
Ω

capital omega

(“oh-may-ga”) greek

Capital omega represents the sample space: the set of all possible outcomes of a random experiment.

ω ∈ Ω; X: Ω → ℝ
δ

delta

(“del-ta”) greek

The Greek letter delta, commonly used to represent a small change or difference in a quantity.

δx = x₂ - x₁; δ-function
Γ

capital gamma

(“gam-a”) greek

Capital gamma denotes the gamma function Γ(n), which extends the factorial to non-integer arguments: Γ(n) = (n-1)! for positive integers n.

Γ(n+1) = n!; Γ(1/2) = √π
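The two identities above can be checked numerically. This is a small sketch using math.gamma and math.factorial from Python's standard library; the range of n tested is an arbitrary choice.

```python
import math

# Γ(n + 1) = n! for non-negative integers n.
for n in range(6):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

# Γ(1/2) = √π, a value the factorial alone cannot express.
gamma_half = math.gamma(0.5)
sqrt_pi = math.sqrt(math.pi)
print(gamma_half, sqrt_pi)
```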
ξ

xi

(“zy” or “ksee”) greek

The Greek letter xi, commonly used as the shape parameter in Extreme Value Theory (GEV distribution). Determines tail behavior.

ξ > 0: Fréchet (fat tails); ξ = 0: Gumbel; ξ < 0: Weibull
ν

nu

(“new”) greek

The Greek letter nu, commonly used for degrees of freedom in t-distributions and chi-squared distributions.

t(ν) distribution; χ²(ν) distribution
λ

lambda

(“lam-da”) greek

The Greek letter lambda, often used as a rate parameter in Poisson or exponential distributions.

Poisson(λ); Exponential(λ)
π

pi

(“pie”) greek

The mathematical constant pi, the ratio of a circle's circumference to its diameter. Appears frequently in probability density functions.

Standard normal PDF: f(x) = (1/√(2π))e^(-x²/2)

∞

infinity

(“infinity”) general

The infinity symbol represents an unbounded quantity. It often appears in integration bounds and in limits.

∫₋∞^∞ f(x)dx = 1; lim_{n→∞}

∈

element of

set

Indicates membership in a set. "x ∈ A" means "x is an element of set A" or "x is in A".

x ∈ ℝ means x is a real number

∀

for all

set

Universal quantifier. "∀x" means "for all x" or "for every x".

∀x > 0, P(X > x) ≥ 0

∃

there exists

set

Existential quantifier. "∃x" means "there exists an x" or "for some x".

∃x such that f(x) = 0

∅

empty set

set

The empty set, a set containing no elements. Also written as {}.

P(∅) = 0

⊂

subset

set

Indicates that one set is contained within another. "A ⊂ B" means "A is a subset of B".

ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ

∪

union

set

Set union. "A ∪ B" is the set of elements in A or B (or both).

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
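A small sketch of the inclusion-exclusion identity above, using Python sets for events on a fair six-sided die. The events A and B are chosen only for illustration, and outcomes are assumed equally likely.

```python
from fractions import Fraction

omega = set(range(1, 7))  # sample space Ω of a fair six-sided die
A = {2, 4, 6}             # event "roll is even"
B = {4, 5, 6}             # event "roll is at least 4"

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

lhs = prob(A | B)                      # P(A ∪ B); | is set union in Python
rhs = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A ∩ B); & is intersection
print(lhs, rhs)
```

Subtracting P(A ∩ B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.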

∩

intersection

set

Set intersection. "A ∩ B" is the set of elements in both A and B.

P(A ∩ B) = P(A)P(B) if independent

ℝ

real numbers

set

The set of all real numbers, including rational and irrational numbers. Continuous random variables typically take values in ℝ or subsets of ℝ.

X: Ω → ℝ

∫

integral

calculus

The integral sign indicates integration, a continuous analog of summation. Used extensively for computing probabilities and expectations.

∫₀¹ f(x)dx; E[X] = ∫xf(x)dx
lim

limit

calculus

The limit of a function or sequence. "lim_{x→a} f(x)" is the value f(x) approaches as x gets closer to a.

lim_{n→∞} (1 + 1/n)ⁿ = e
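To see this limit numerically, the sketch below evaluates (1 + 1/n)ⁿ for increasing n and prints the gap to math.e. The helper name compound and the chosen values of n are illustrative.

```python
import math

def compound(n):
    """Return (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

# The gap to e shrinks as n grows, which is what the limit statement asserts.
for n in (1, 10, 1_000, 1_000_000):
    print(n, compound(n), abs(compound(n) - math.e))
```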

∂

partial

calculus

Indicates a partial derivative - the derivative with respect to one variable while holding others constant.

∂f/∂x
P

probability

probability

The probability function. P(A) gives the probability of event A occurring.

P(X > 0); P(A | B)
E

expectation

probability

The expectation operator. E[X] is the expected value (mean) of random variable X.

E[X] = ∫xf(x)dx; E[X²]
Var

variance

probability

The variance of a random variable, measuring its spread around the mean. Var(X) = E[(X - μ)²].

Var(X) = E[X²] - (E[X])²
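The definition and the shortcut formula give the same number. Here is a minimal check for a fair six-sided die (an example chosen for illustration), using exact fractions to avoid rounding.

```python
from fractions import Fraction

# X is the outcome of a fair six-sided die: each value has probability 1/6.
values = range(1, 7)
p = Fraction(1, 6)

mean = sum(p * x for x in values)                          # μ = E[X]
var_definition = sum(p * (x - mean) ** 2 for x in values)  # E[(X - μ)²]
var_shortcut = sum(p * x**2 for x in values) - mean**2     # E[X²] - (E[X])²

print(mean, var_definition, var_shortcut)
```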
~

distributed as

probability

Indicates the distribution of a random variable. "X ~ N(0,1)" means "X is distributed as standard normal".

X ~ N(μ, σ²); X ~ Exp(λ)
f(x)

PDF

probability

The probability density function (for continuous distributions). Describes the relative likelihood of different values.

f(x) = (1/√(2π))e^(-x²/2) (standard normal)
F(x)

CDF

probability

The cumulative distribution function. F(x) = P(X ≤ x) gives the probability of being at or below x.

F(x) = ∫₋∞ˣ f(t)dt
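As a rough sketch of the relationship F(x) = ∫₋∞ˣ f(t)dt, the code below integrates the standard normal density numerically and compares the result with the closed form based on math.erf. The trapezoidal rule, the truncation of -∞ at -10, and the step count are assumptions made for this illustration.

```python
import math

def pdf(t):
    """Standard normal density f(t) = (1/√(2π)) e^(-t²/2)."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def cdf_numeric(x, lower=-10.0, steps=20_000):
    """Approximate F(x) with the trapezoidal rule on [lower, x].

    The lower limit -∞ is truncated at `lower`, where the density is negligible.
    """
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    for k in range(1, steps):
        total += pdf(lower + k * h)
    return total * h

def cdf_exact(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, cdf_numeric(x), cdf_exact(x))
```

By symmetry of the bell curve, F(0) should come out as 0.5 in both columns.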
S(x)

Survival function

probability

The survival function (also called tail function). S(x) = P(X > x) = 1 - F(x) gives the probability of exceeding x.

S(x) = 1 - F(x); S(x) = P(X > x)
Cov

covariance

probability

The covariance measures the joint variability of two random variables. Cov(X,Y) = E[(X - μ_X)(Y - μ_Y)].

Cov(X,Y) = E[XY] - E[X]E[Y]; Cov(X,X) = Var(X)
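The covariance identities can be checked on a tiny data set. In this sketch the paired values xs and ys are invented, and each pair is treated as equally likely, so these are population (not sample) covariances.

```python
# Hypothetical paired observations, each pair treated as equally likely.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 9.0]

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Cov(A, B) = E[(A - μ_A)(B - μ_B)]."""
    mu_a, mu_b = mean(a), mean(b)
    return mean([(x - mu_a) * (y - mu_b) for x, y in zip(a, b)])

def cov_shortcut(a, b):
    """Cov(A, B) = E[AB] - E[A]E[B]."""
    return mean([x * y for x, y in zip(a, b)]) - mean(a) * mean(b)

print(cov(xs, ys), cov_shortcut(xs, ys))  # both forms agree
print(cov(xs, xs))                        # Cov(X, X) is the variance of X
```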
i.i.d.

i.i.d.

probability

Independent and identically distributed: random variables that are mutually independent and share the same probability distribution.

X₁, X₂, ..., Xₙ i.i.d. ~ N(0,1)
a.s.

almost surely

probability

Almost surely (or almost certain) means an event happens with probability 1. Used in convergence theorems.

Xₙ → X a.s.; P(event) = 1

⊥

independence

probability

The independence symbol indicates that random variables or events are statistically independent.

X ⊥ Y; A ⊥ B means P(A∩B) = P(A)P(B)
~

asymptotic equivalence

(“is asymptotically equivalent to”) calculus

f(x) ~ g(x) means lim f(x)/g(x) = 1 as x → ∞. The functions become indistinguishable in ratio for large x. Used extensively to describe tail behavior.

P(X > x) ~ x^{-α}; n! ~ √(2πn)(n/e)^n (Stirling)
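Stirling's formula gives a concrete case where the ratio, not the difference, tends to 1. The sketch below prints n! / (√(2πn)(n/e)ⁿ) for a few values of n; the helper names are illustrative.

```python
import math

def stirling(n):
    """Stirling's approximation √(2πn) (n/e)^n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

def ratio(n):
    """n! divided by its Stirling approximation."""
    return math.factorial(n) / stirling(n)

# The ratio approaches 1 even though n! - stirling(n) itself keeps growing.
for n in (1, 5, 10, 50, 100):
    print(n, ratio(n))
```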
O(·)

big-O

(“big-O of”) calculus

f(x) = O(g(x)) means |f(x)| ≤ C|g(x)| for some constant C and sufficiently large x. It provides an upper bound on the growth rate.

x² + x = O(x²); sin(x) = O(1)
o(·)

little-o

(“little-o of”) calculus

f(x) = o(g(x)) means f(x)/g(x) → 0 as x → ∞. The function f becomes negligible compared to g. This is a stronger statement than big-O: f = o(g) implies f = O(g), but not the reverse.

x = o(x²); ln(x) = o(x^ε) for any ε > 0
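Both examples can be explored numerically by printing the ratio f(x)/g(x) for growing x. This is only a sketch: ε = 0.1 and the sample points are arbitrary choices, and a finite table can suggest a limit but not prove it.

```python
import math

def ratio_power(x):
    """x / x², which should shrink to 0 if x = o(x²)."""
    return x / x**2

def ratio_log(x, eps=0.1):
    """ln(x) / x^eps, which should shrink to 0 if ln(x) = o(x^eps)."""
    return math.log(x) / x**eps

for x in (1e2, 1e5, 1e20, 1e100):
    print(x, ratio_power(x), ratio_log(x))
```

With a small ε the second ratio first rises before it falls, which is a reminder that little-o describes behavior only for sufficiently large x.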
L(x)

slowly varying function

probability

A function L(x) is slowly varying if L(tx)/L(x) → 1 as x → ∞ for all t > 0. Examples include positive constants, ln(x), and (ln x)^k. Used in fat tail theory to capture deviations from pure power laws.

S(x) = x^{-α}L(x); ln(x) is slowly varying
RV_α

regularly varying

probability

A function f(x) is regularly varying with index α if f(tx)/f(x) → t^α as x → ∞ for all t > 0. Can be written as f(x) = x^α L(x) where L is slowly varying. Characterizes fat-tailed distributions.

S(x) = x^{-2}ln(x) is RV_{-2}; Pareto survival is RV_{-α}
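The first example can be probed numerically. This sketch takes f(x) = x^{-2}ln(x), fixes t = 2 (an arbitrary choice), and prints f(tx)/f(x) next to the target t^{-2}, along with the slowly varying ratio ln(tx)/ln(x).

```python
import math

def f(x):
    """Example from the glossary: f(x) = x^(-2) ln(x)."""
    return x ** -2 * math.log(x)

def rv_ratio(x, t):
    """f(tx)/f(x), which should approach t^(-2) for large x."""
    return f(t * x) / f(x)

def sv_ratio(x, t):
    """ln(tx)/ln(x), which should approach 1 for large x."""
    return math.log(t * x) / math.log(x)

t = 2.0
for x in (1e2, 1e10, 1e50, 1e100):
    print(x, rv_ratio(x, t), t ** -2, sv_ratio(x, t))
```

Convergence is slow here because the logarithmic factor only approaches 1 at a logarithmic rate, so very large x is needed before the ratio settles near 0.25.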
D(G)

domain of attraction

probability

A distribution F is in the domain of attraction of G (written F ∈ D(G)) if the maximum of n i.i.d. samples from F, properly normalized, converges in distribution to G as n → ∞. Determines which extreme value type applies.

Pareto ∈ D(Fréchet); Gaussian ∈ D(Gumbel)

Common Notation Patterns

These patterns show how to read common mathematical expressions. Practice reading them aloud to build fluency.

Probability Notation

P(X > x)

Read: The probability that X is greater than x

What fraction of the time does X exceed the value x?

P(a ≤ X ≤ b)

Read: The probability that X is between a and b

What fraction of the time is X in the range from a to b?

P(A | B)

Read: The probability of A given B

If we know B happened, what's the probability of A?

E[X]

Read: The expected value of X

The average value of X over many observations

E[g(X)]

Read: The expected value of g of X

The average value of the function g applied to X

Var(X)

Read: The variance of X

How spread out is X around its mean?

X ~ N(μ, σ²)

Read: X is normally distributed with mean μ and variance σ²

X follows the famous bell curve centered at μ

X ~ Exponential(λ)

Read: X is exponentially distributed with rate λ

X follows the exponential distribution, modeling waiting times

X ~ Pareto(α, xₘ)

Read: X is Pareto distributed with tail exponent α and scale xₘ

X follows a power law distribution with heavy tails

P(A ∩ B)

Read: The probability of A and B both occurring

Probability that both events A and B happen

P(A ∪ B)

Read: The probability of A or B occurring

Probability that at least one of A or B happens

Cov(X,Y)

Read: The covariance of X and Y

How much X and Y vary together

F ∈ D(G)

Read: F is in the domain of attraction of G

Maxima from F converge to the extreme value distribution G

S(x) = x^{-α}L(x)

Read: S of x equals x to the negative alpha times L of x

The survival function is a power law with slowly varying correction L(x)

Function Notation

f(x)

Read: f of x

The output of function f when given input x

f: A → B

Read: f maps from A to B

Function f takes inputs from set A and produces outputs in set B

lim_{x→∞} f(x)

Read: The limit of f(x) as x approaches infinity

What value does f(x) approach as x gets arbitrarily large?

lim_{x→a} f(x)

Read: The limit of f(x) as x approaches a

What value does f(x) approach as x gets closer and closer to a?

n → ∞

Read: n approaches infinity

n grows without bound

f(x) ~ g(x)

Read: f is asymptotically equivalent to g

The ratio f(x)/g(x) approaches 1 as x → ∞

f(x) = O(g(x))

Read: f is big-O of g

f is bounded above by a constant times g for large x

f(x) = o(g(x))

Read: f is little-o of g

f becomes negligible compared to g as x → ∞

Integration

∫ f(x) dx

Read: The integral of f(x) with respect to x

Add up all the infinitesimally small pieces of f(x)

∫ₐᵇ f(x) dx

Read: The integral of f(x) from a to b

Add up f(x) values for all x between a and b

∫₋∞^∞ f(x) dx

Read: The integral of f(x) from negative infinity to positive infinity

Add up f(x) over all possible values of x

Set Notation

{x | P(x)}

Read: The set of all x such that P(x) holds

All values x that satisfy the condition P(x)

x ∈ A

Read: x is an element of A

x is one of the things in set A

Quick Reference: Most Common Symbols

μ
mu

Mean

σ
sigma

Std deviation

Σ
Sigma

Summation

α
alpha

Tail exponent

∫
integral

Integration

E[X]

Expected value

P(·)

Probability

Var(X)

Variance

~

Distributed as