Mathematical Notation Glossary
A reference guide for symbols and notation used throughout the course. New to mathematical notation? Start with the Greek letters and common patterns.
mu (“mew”) [greek]
The Greek letter mu, commonly used to represent the population mean or expected value of a distribution.
Examples: μ = E[X]; X ~ N(μ, σ²)

sigma (“sig-ma”) [greek]
The Greek letter sigma, used for standard deviation (σ) or variance (σ²). Measures the spread of a distribution around its mean.
Examples: σ = √Var(X); σ² = E[(X - μ)²]

capital sigma (“sig-ma”) [greek]
Capital sigma indicates summation - adding up a series of terms. The subscript and superscript specify the range.
Example: Σᵢ₌₁ⁿ xᵢ means x₁ + x₂ + ... + xₙ

alpha (“al-fa”) [greek]
The Greek letter alpha, often used for the tail exponent in power law distributions, or as a general parameter.
Examples: P(X > x) ~ x^(-α); α-stable distribution

beta (“bay-ta”) [greek]
The Greek letter beta, commonly used as a parameter in distributions or regression coefficients.
Example: Beta(α, β) distribution

gamma (“gam-a”) [greek]
The Greek letter gamma, often used to represent skewness (third standardized moment) or as a parameter.
Example: γ = E[(X-μ)³]/σ³

kappa (“kap-a”) [greek]
The Greek letter kappa, sometimes used for kurtosis (fourth standardized moment).
Example: κ = E[(X-μ)⁴]/σ⁴

capital omega (“oh-may-ga”) [greek]
Capital omega represents the sample space - the set of all possible outcomes of a random experiment.
Examples: ω ∈ Ω; X: Ω → ℝ

delta (“del-ta”) [greek]
The Greek letter delta, commonly used to represent a small change or difference in a quantity.
Examples: δx = x₂ - x₁; δ-function

capital gamma (“gam-a”) [greek]
Capital gamma denotes the gamma function Γ(n), which extends factorial to non-integers: Γ(n) = (n-1)! for positive integers.
Examples: Γ(n+1) = n!; Γ(1/2) = √π
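
As a quick numerical check of the factorial identity in the entry above, here is a minimal Python sketch; it uses only the standard library, and the range of n is an arbitrary illustration choice:

```python
import math

# Gamma extends the factorial: Gamma(n + 1) = n! for non-negative integers n.
for n in range(1, 6):
    print(n, math.gamma(n + 1), math.factorial(n))  # the two columns should match

# Gamma(1/2) = sqrt(pi)
print(math.gamma(0.5), math.sqrt(math.pi))
```
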
xi (“zy or ksee”) [greek]
The Greek letter xi, commonly used as the shape parameter in Extreme Value Theory (GEV distribution). Determines tail behavior.
Examples: ξ > 0: Fréchet (fat tails); ξ = 0: Gumbel; ξ < 0: Weibull

nu (“new”) [greek]
The Greek letter nu, commonly used for degrees of freedom in t-distributions and chi-squared distributions.
Examples: t(ν) distribution; χ²(ν) distribution

lambda (“lam-da”) [greek]
The Greek letter lambda, often used as a rate parameter in Poisson or exponential distributions.
Examples: Poisson(λ); Exponential(λ)

pi (“pie”) [greek]
The mathematical constant pi, the ratio of a circle's circumference to its diameter. Appears frequently in probability density functions.
Example: Normal PDF: (1/√(2π))e^(-x²/2)

infinity (“infinity”) [general]
The infinity symbol represents an unbounded quantity. Often appears in integration limits.
Examples: ∫₋∞^∞ f(x)dx = 1; lim_{n→∞}

element of [set]
Indicates membership in a set. "x ∈ A" means "x is an element of set A" or "x is in A".
Example: x ∈ ℝ means x is a real number

for all [set]
Universal quantifier. "∀x" means "for all x" or "for every x".
Example: ∀x > 0, P(X > x) ≥ 0

there exists [set]
Existential quantifier. "∃x" means "there exists an x" or "for some x".
Example: ∃x such that f(x) = 0

empty set [set]
The empty set, a set containing no elements. Also written as {}.
Example: P(∅) = 0

subset [set]
Indicates that one set is contained within another. "A ⊂ B" means "A is a subset of B".
Example: ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ

union [set]
Set union. "A ∪ B" is the set of elements in A or B (or both).
Example: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

intersection [set]
Set intersection. "A ∩ B" is the set of elements in both A and B.
Example: P(A ∩ B) = P(A)P(B) if independent

real numbers [set]
The set of all real numbers, including rational and irrational numbers. Continuous random variables typically take values in ℝ or subsets of ℝ.
Example: X: Ω → ℝ

integral [calculus]
The integral sign indicates integration - a continuous analog of summation. Used extensively for computing probabilities and expectations.
Examples: ∫₀¹ f(x)dx; E[X] = ∫xf(x)dx

limit [calculus]
The limit of a function or sequence. "lim_{x→a} f(x)" is the value f(x) approaches as x gets closer to a.
Example: lim_{n→∞} (1 + 1/n)ⁿ = e
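
To watch the limit example converge, here is a small sketch using only the Python standard library; the particular values of n are arbitrary:

```python
import math

# (1 + 1/n)^n approaches e as n grows without bound
for n in (10, 1_000, 100_000, 10_000_000):
    print(n, (1 + 1 / n) ** n)

print("e =", math.e)
```
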
partial [calculus]
Indicates a partial derivative - the derivative with respect to one variable while holding others constant.
Example: ∂f/∂x

probability [probability]
The probability function. P(A) gives the probability of event A occurring.
Examples: P(X > 0); P(A | B)

expectation [probability]
The expectation operator. E[X] is the expected value (mean) of random variable X.
Examples: E[X] = ∫xf(x)dx; E[X²]

variance [probability]
The variance of a random variable, measuring its spread around the mean. Var(X) = E[(X - μ)²].
Example: Var(X) = E[X²] - (E[X])²
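
A minimal simulation sketch of the variance identity above, assuming NumPy is available; the seed, sample size, and the N(2, 9) distribution are arbitrary illustration choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1_000_000)  # X ~ N(2, 9)

# Var(X) = E[X^2] - (E[X])^2, estimated from the sample
print(np.mean(x**2) - np.mean(x)**2)  # ~ 9
print(np.var(x))                      # same quantity, computed directly
```
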
distributed as [probability]
Indicates the distribution of a random variable. "X ~ N(0,1)" means "X is distributed as standard normal".
Examples: X ~ N(μ, σ²); X ~ Exp(λ)

PDF [probability]
The probability density function (for continuous distributions). Describes the relative likelihood of different values.
Example: f(x) = (1/√(2π))e^(-x²/2)

CDF [probability]
The cumulative distribution function. F(x) = P(X ≤ x) gives the probability of being at or below x.
Example: F(x) = ∫₋∞ˣ f(t)dt

Survival function [probability]
The survival function (also called tail function). S(x) = P(X > x) = 1 - F(x) gives the probability of exceeding x.
Examples: S(x) = 1 - F(x); S(x) = P(X > x)
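
As a numerical illustration of F(x) and S(x) = 1 - F(x), a short sketch assuming SciPy is installed; the standard normal and the threshold 1.96 are arbitrary choices:

```python
from scipy.stats import norm

x = 1.96
print(norm.cdf(x))      # F(x) = P(X <= x), about 0.975 for the standard normal
print(norm.sf(x))       # S(x) = P(X > x), the survival function
print(1 - norm.cdf(x))  # same as the survival function, by definition
```
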
covariance [probability]
The covariance measures the joint variability of two random variables. Cov(X,Y) = E[(X - μ_X)(Y - μ_Y)].
Examples: Cov(X,Y) = E[XY] - E[X]E[Y]; Cov(X,X) = Var(X)
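
A quick simulation sketch of the covariance identities, assuming NumPy; the linear relationship between x and y is an arbitrary choice that makes the covariance nonzero:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500_000)
y = 0.5 * x + rng.normal(size=500_000)

# Cov(X, Y) = E[XY] - E[X]E[Y]
print(np.mean(x * y) - np.mean(x) * np.mean(y))  # ~ 0.5
# Cov(X, X) = Var(X)
print(np.cov(x, x)[0, 1], np.var(x, ddof=1))     # both ~ 1
```
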
i.i.d. [probability]
Independent and identically distributed: random variables that are mutually independent and share the same probability distribution.
Example: X₁, X₂, ..., Xₙ i.i.d. ~ N(0,1)

almost surely [probability]
Almost surely (or almost certain) means an event happens with probability 1. Used in convergence theorems.
Examples: Xₙ → X a.s.; P(event) = 1

independence [probability]
The independence symbol indicates that random variables or events are statistically independent.
Examples: X ⊥ Y; A ⊥ B means P(A∩B) = P(A)P(B)

asymptotic equivalence (“is asymptotically equivalent to”) [calculus]
f(x) ~ g(x) means lim f(x)/g(x) = 1 as x → ∞. The functions become indistinguishable in ratio for large x. Used extensively to describe tail behavior.
Examples: P(X > x) ~ x^{-α}; n! ~ √(2πn)(n/e)^n (Stirling)
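
To see asymptotic equivalence in action, a small sketch checking Stirling's formula, using only the standard library; it works in log space so large n does not overflow, and the chosen values of n are arbitrary:

```python
import math

# n! ~ sqrt(2*pi*n) * (n/e)^n : the ratio should approach 1 as n grows.
for n in (5, 50, 500, 5000):
    log_factorial = math.lgamma(n + 1)  # ln(n!)
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    print(n, math.exp(log_factorial - log_stirling))  # ratio, heading toward 1
```
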
big-O (“big-O of”) [calculus]
f(x) = O(g(x)) means |f(x)| ≤ C|g(x)| for some constant C and sufficiently large x. It provides an upper bound on the growth rate.
Examples: x² + x = O(x²); sin(x) = O(1)

little-o (“little-o of”) [calculus]
f(x) = o(g(x)) means f(x)/g(x) → 0 as x → ∞. The function f becomes negligible compared to g. Stronger than big-O.
Examples: x = o(x²); ln(x) = o(x^ε) for any ε > 0
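
A tiny numerical sketch of the little-o example ln(x) = o(x^ε), here with ε = 0.1 picked arbitrarily (standard library only):

```python
import math

# ln(x) / x^0.1 shrinks toward 0 as x grows, even though x^0.1 grows very slowly
for x in (1e6, 1e12, 1e24, 1e48):
    print(x, math.log(x) / x ** 0.1)
```
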
slowly varying function [probability]
A function L(x) is slowly varying if L(tx)/L(x) → 1 as x → ∞ for all t > 0. Examples include constants, ln(x), and (ln x)^k. Used in fat tail theory to capture deviations from pure power laws.
Examples: S(x) = x^{-α}L(x); ln(x) is slowly varying

regularly varying [probability]
A function f(x) is regularly varying with index α if f(tx)/f(x) → t^α as x → ∞ for all t > 0. Can be written as f(x) = x^α L(x) where L is slowly varying. Characterizes fat-tailed distributions.
Examples: S(x) = x^{-2} ln(x) is RV_{-2}; Pareto survival is RV_{-α}
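
A short numerical sketch of both definitions, using L(x) = ln(x) and S(x) = x^{-2} ln(x) from the examples above; standard library only, and t = 2 is an arbitrary choice:

```python
import math

def L(x):
    return math.log(x)            # slowly varying: ln(x)

def S(x):
    return x ** -2 * math.log(x)  # regularly varying with index -2

t = 2.0
for x in (1e2, 1e5, 1e10, 1e20):
    # first ratio -> 1, second ratio -> t^(-2) = 0.25
    print(x, L(t * x) / L(x), S(t * x) / S(x))
```
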
domain of attraction [probability]
A distribution F is in the domain of attraction of G (written F ∈ D(G)) if the maximum of n i.i.d. samples from F, properly normalized, converges to G. Determines which extreme value type applies.
Examples: Pareto ∈ D(Fréchet); Gaussian ∈ D(Gumbel)
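
As an illustration of a domain-of-attraction statement, here is a small simulation sketch assuming NumPy. It uses the exponential distribution, which (like the Gaussian) lies in the Gumbel domain; the seed, block size, and number of trials are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 1_000, 5_000

# Maxima of n Exponential(1) samples, centered by ln(n), should look Gumbel.
maxima = rng.exponential(size=(trials, n)).max(axis=1) - np.log(n)

# Compare the empirical CDF of the maxima with the standard Gumbel CDF exp(-exp(-x)).
for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, np.mean(maxima <= x), np.exp(-np.exp(-x)))
```
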
Common Notation Patterns

These patterns show how to read common mathematical expressions. Practice reading them aloud to build fluency.
Probability Notation
P(X > x)
Read: “The probability that X is greater than x”
What fraction of the time does X exceed the value x?

P(a ≤ X ≤ b)
Read: “The probability that X is between a and b”
What fraction of the time is X in the range from a to b?

P(A | B)
Read: “The probability of A given B”
If we know B happened, what's the probability of A?

E[X]
Read: “The expected value of X”
The average value of X over many observations

E[g(X)]
Read: “The expected value of g of X”
The average value of the function g applied to X

Var(X)
Read: “The variance of X”
How spread out is X around its mean?

X ~ N(μ, σ²)
Read: “X is normally distributed with mean μ and variance σ²”
X follows the famous bell curve centered at μ

X ~ Exponential(λ)
Read: “X is exponentially distributed with rate λ”
X follows the exponential distribution, modeling waiting times

X ~ Pareto(α, xₘ)
Read: “X is Pareto distributed with tail exponent α and scale xₘ”
X follows a power law distribution with heavy tails

P(A ∩ B)
Read: “The probability of A and B both occurring”
Probability that both events A and B happen

P(A ∪ B)
Read: “The probability of A or B occurring”
Probability that at least one of A or B happens

Cov(X,Y)
Read: “The covariance of X and Y”
How much X and Y vary together

F ∈ D(G)
Read: “F is in the domain of attraction of G”
Maxima from F converge to the extreme value distribution G

S(x) = x^{-α}L(x)
Read: “S of x equals x to the negative alpha times L of x”
The survival function is a power law with slowly varying correction L(x)
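
To connect a couple of these patterns to numbers, a brief sketch assuming SciPy; the standard normal and the interval [-1, 1] are arbitrary choices:

```python
from scipy.stats import norm

# P(a <= X <= b) = F(b) - F(a) for X ~ N(0, 1)
a, b = -1.0, 1.0
print(norm.cdf(b) - norm.cdf(a))  # ~ 0.683, the familiar "one sigma" probability

# P(X > x), read "the probability that X is greater than x"
print(norm.sf(1.96))              # ~ 0.025
```
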
Function Notation
f(x)
Read: “f of x”
The output of function f when given input x

f: A → B
Read: “f maps from A to B”
Function f takes inputs from set A and produces outputs in set B

lim_{x→∞} f(x)
Read: “The limit of f(x) as x approaches infinity”
What value does f(x) approach as x gets arbitrarily large?

lim_{x→a} f(x)
Read: “The limit of f(x) as x approaches a”
What value does f(x) approach as x gets closer and closer to a?

n → ∞
Read: “n approaches infinity”
n grows without bound

f(x) ~ g(x)
Read: “f is asymptotically equivalent to g”
The ratio f(x)/g(x) approaches 1 as x → ∞

f(x) = O(g(x))
Read: “f is big-O of g”
f is bounded above by a constant times g for large x

f(x) = o(g(x))
Read: “f is little-o of g”
f becomes negligible compared to g as x → ∞
Integration
∫ f(x) dx
Read: “The integral of f(x) with respect to x”
Add up all the infinitesimally small pieces of f(x)

∫ₐᵇ f(x) dx
Read: “The integral of f(x) from a to b”
Add up f(x) values for all x between a and b

∫₋∞^∞ f(x) dx
Read: “The integral of f(x) from negative infinity to positive infinity”
Add up f(x) over all possible values of x
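
To make these integrals concrete, a short numerical sketch assuming SciPy; the standard normal density is an arbitrary example integrand:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Total probability: the standard normal density integrates to 1 over the whole line.
total, _ = quad(norm.pdf, -np.inf, np.inf)
print(total)   # ~ 1.0

# A moment as an integral: E[X^2] = integral of x^2 * f(x) dx, here ~ 1 for N(0, 1).
second, _ = quad(lambda x: x**2 * norm.pdf(x), -np.inf, np.inf)
print(second)  # ~ 1.0
```
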
Set Notation
{x | P(x)}
Read: “The set of all x such that P(x) holds”
All values x that satisfy the condition P(x)

x ∈ A
Read: “x is an element of A”
x is one of the things in set A
Quick Reference: Most Common Symbols
μ         Mean
σ         Std deviation
Σ         Summation
α         Tail exponent
∫         Integration
E[X]      Expected value
P(A)      Probability
Var(X)    Variance
~         Distributed as