
Markov Chains

Visualize state transitions, probability distributions, and steady states in stochastic systems.


Concept Overview

A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. This mathematical property—that the future depends only on the present, and not on the past—is known as the Markov property. Markov chains are fundamental to probability theory and have far-reaching applications in fields ranging from economics and genetics to computer science and artificial intelligence.

Mathematical Definition

A sequence of random variables X_0, X_1, X_2, … is a Markov chain if it satisfies the Markov property:

P(X_{n+1} = x | X_n = x_n, …, X_0 = x_0) = P(X_{n+1} = x | X_n = x_n)

For a discrete-time Markov chain with a finite number of states, the transition probabilities from state i to state j are organized into a transition matrix P:

P_{ij} = P(X_{n+1} = j | X_n = i)

where each row sums to 1:

Σ_j P_{ij} = 1
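As a minimal sketch, here is a hypothetical two-state weather chain encoded as a transition matrix, with the row-sum property checked explicitly (the states and probabilities are illustrative, not from the text above):

```python
import numpy as np

# Hypothetical 2-state weather chain: state 0 = "sunny", state 1 = "rainy".
# Entry P[i, j] is the probability of moving from state i to state j.
P = np.array([
    [0.9, 0.1],  # sunny -> sunny 0.9, sunny -> rainy 0.1
    [0.5, 0.5],  # rainy -> sunny 0.5, rainy -> rainy 0.5
])

# Every row of a valid transition matrix must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)
print(P.sum(axis=1))  # -> [1. 1.]
```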

Key Concepts

State Distribution Over Time

Let v_n be a row vector representing the probability distribution across states at time step n. The distribution at the next time step is obtained by multiplying v_n by the transition matrix P:

v_{n+1} = v_n P

By induction:

v_n = v_0 P^n
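The update rule above is a single row-vector-by-matrix product, and the n-step distribution is the same product with P raised to the n-th power. A short sketch, reusing a hypothetical two-state matrix:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

v0 = np.array([1.0, 0.0])  # start in state 0 with certainty

# One step: v1 = v0 P (row vector times matrix).
v1 = v0 @ P
print(v1)  # -> [0.9 0.1]

# n steps at once: vn = v0 P^n.
n = 10
vn = v0 @ np.linalg.matrix_power(P, n)
print(vn)  # still a probability distribution: entries sum to 1
```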

Steady-State Distribution

Under certain conditions (when the chain is irreducible and aperiodic), the distribution v_n converges to a unique stationary distribution π as n approaches infinity, regardless of the initial distribution v_0. The steady-state vector π satisfies:

π = π P

and its entries must sum to 1:

Σ_i π_i = 1

Notice that π is a left eigenvector of the transition matrix P with eigenvalue λ = 1. For a 2-state Markov chain with states A and B, where P(A → B) = p and P(B → A) = q, the steady-state distribution is:

π_A = q / (p + q)
π_B = p / (p + q)
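One way to compute π numerically is to find the left eigenvector of P for eigenvalue 1 (equivalently, an eigenvector of the transpose P^T) and normalize it to sum to 1. A sketch with assumed values p = 0.1 and q = 0.5, checked against the closed form above:

```python
import numpy as np

p, q = 0.1, 0.5  # assumed values for P(A -> B) and P(B -> A)
P = np.array([[1 - p, p],
              [q, 1 - q]])

# pi P = pi  <=>  P^T pi^T = pi^T, so take the eigenvector of P^T
# whose eigenvalue is closest to 1.
eigvals, eigvecs = np.linalg.eig(P.T)
i = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, i])
pi = pi / pi.sum()  # normalize so the entries sum to 1

print(pi)  # ~ [0.8333 0.1667]
# Closed form for the 2-state chain:
print(q / (p + q), p / (p + q))
```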

Irreducibility and Periodicity

A Markov chain is irreducible if it is possible to get from any state to any other state (possibly in more than one step). A state is periodic if returns to that state can only occur at multiples of some integer greater than 1. An irreducible, positive recurrent, and aperiodic Markov chain is called ergodic, which guarantees convergence to a unique stationary distribution.
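For a finite chain with k states, irreducibility can be checked mechanically: every state can reach every other state exactly when the matrix (I + P)^(k-1) has no zero entries. A small sketch with two illustrative chains (the matrices are assumptions for demonstration):

```python
import numpy as np

def is_irreducible(P):
    """A finite k-state chain is irreducible iff (I + P)^(k-1)
    has strictly positive entries everywhere."""
    k = P.shape[0]
    M = np.linalg.matrix_power(np.eye(k) + P, k - 1)
    return bool(np.all(M > 0))

# Irreducible: both states communicate.
P1 = np.array([[0.9, 0.1],
               [0.5, 0.5]])
# Reducible: state 1 is absorbing, so state 0 is unreachable from it.
P2 = np.array([[0.5, 0.5],
               [0.0, 1.0]])

print(is_irreducible(P1))  # -> True
print(is_irreducible(P2))  # -> False
```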

Historical Context

The theory of Markov chains is named after the Russian mathematician Andrey Markov, who introduced them in a 1906 paper. Markov originally developed the theory to prove that the Law of Large Numbers could apply to dependent random variables, extending previous work that required independence.

In 1913, Markov famously applied his chains to study the alternation of vowels and consonants in Alexander Pushkin's verse novel Eugene Onegin. This is considered one of the first practical applications of the theory, anticipating modern natural language processing techniques by nearly a century.

Real-world Applications

  • Google's PageRank Algorithm: The original PageRank algorithm models web surfing as a Markov chain. The PageRank of a webpage is proportional to its probability in the steady-state distribution of this chain.
  • Hidden Markov Models (HMMs): Used extensively in speech recognition, bioinformatics (like gene prediction), and time-series analysis where the underlying states are unobserved.
  • Operations Research: Modeling queuing systems, inventory management, and machine maintenance (predicting when a machine will break down based on its current wear state).
  • Finance and Economics: Modeling stock market trends (e.g., transitions between bull, bear, and stagnant markets) and credit rating migrations.
  • Generative AI: Markov chain concepts are foundational to Markov Chain Monte Carlo (MCMC) methods used in Bayesian inference, and inspired early text generation models before modern neural networks.
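To make the PageRank application concrete, here is a toy power-iteration sketch on a hypothetical three-page web (the link graph and damping factor d = 0.85 are illustrative assumptions, not Google's actual data):

```python
import numpy as np

# Hypothetical 3-page link graph; row i gives where a surfer on page i goes next.
links = np.array([[0.0, 0.5, 0.5],   # page 0 links to pages 1 and 2
                  [1.0, 0.0, 0.0],   # page 1 links to page 0
                  [1.0, 0.0, 0.0]])  # page 2 links to page 0

d = 0.85                      # damping: follow a link with prob d, else jump anywhere
n = links.shape[0]
G = d * links + (1 - d) / n   # the resulting matrix is still row-stochastic

# Power iteration: repeatedly apply v <- v G until it settles.
v = np.full(n, 1.0 / n)
for _ in range(100):
    v = v @ G

print(v)  # steady-state distribution = PageRank-style scores; page 0 ranks highest
```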

Related Concepts

  • Eigenvalues & Eigenvectors — finding the steady-state vector involves computing the eigenvector corresponding to an eigenvalue of 1.
  • Matrix Multiplication — the state distribution is updated by multiplying the state vector by the transition matrix.
  • Monte Carlo Simulation — generating random samples based on transition probabilities to simulate complex systems over time.
