
Markov Chains

Visualize state transitions, probability distributions, and steady states in stochastic systems.


Concept Overview

A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. This mathematical property—that the future depends only on the present, and not on the past—is known as the Markov property. Markov chains are fundamental to probability theory and have far-reaching applications in fields ranging from economics and genetics to computer science and artificial intelligence.

Mathematical Definition

A sequence of random variables X_0, X_1, X_2, … is a Markov chain if it satisfies the Markov property:

P(X_{n+1} = x | X_n = x_n, …, X_0 = x_0) = P(X_{n+1} = x | X_n = x_n)

For a discrete-time Markov chain with a finite number of states, the transition probabilities from state i to state j are organized into a transition matrix P:

P_{ij} = P(X_{n+1} = j | X_n = i)

where each row sums to 1:

Σ_j P_{ij} = 1
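As a minimal sketch, here is a hypothetical two-state weather chain encoded as a transition matrix, with the row-sum property checked explicitly (the states and probabilities are illustrative, not from the text above):

```python
import numpy as np

# Hypothetical 2-state weather chain: state 0 = "sunny", state 1 = "rainy".
# Entry P[i, j] is the probability of moving from state i to state j.
P = np.array([
    [0.9, 0.1],  # sunny -> sunny 0.9, sunny -> rainy 0.1
    [0.5, 0.5],  # rainy -> sunny 0.5, rainy -> rainy 0.5
])

# Every row of a valid transition matrix must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)
print(P.sum(axis=1))  # -> [1. 1.]
```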

Key Concepts

State Distribution Over Time

Let v_n be a row vector representing the probability distribution across states at time step n. The distribution at the next time step is obtained by multiplying v_n by the transition matrix P:

v_{n+1} = v_n P

By induction:

v_n = v_0 P^n
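The update rule above is a single row-vector-by-matrix product, and the n-step distribution is the same product with P raised to the n-th power. A short sketch, reusing a hypothetical two-state matrix:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

v0 = np.array([1.0, 0.0])  # start in state 0 with certainty

# One step: v1 = v0 P (row vector times matrix).
v1 = v0 @ P
print(v1)  # -> [0.9 0.1]

# n steps at once: vn = v0 P^n.
n = 10
vn = v0 @ np.linalg.matrix_power(P, n)
print(vn)  # still a probability distribution: entries sum to 1
```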

Steady-State Distribution

Under certain conditions (when the chain is irreducible and aperiodic), the distribution v_n converges to a unique stationary distribution π as n approaches infinity, regardless of the initial distribution v_0. The steady-state vector π satisfies:

π = π P

and its entries must sum to 1:

Σ_i π_i = 1

Notice that π is a left eigenvector of the transition matrix P with eigenvalue λ = 1. For a 2-state Markov chain with states A and B, where P(A → B) = p and P(B → A) = q, the steady-state distribution is:

π_A = q / (p + q)
π_B = p / (p + q)
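One way to compute π numerically is to find the left eigenvector of P for eigenvalue 1 (equivalently, an eigenvector of the transpose P^T) and normalize it to sum to 1. A sketch with assumed values p = 0.1 and q = 0.5, checked against the closed form above:

```python
import numpy as np

p, q = 0.1, 0.5  # assumed values for P(A -> B) and P(B -> A)
P = np.array([[1 - p, p],
              [q, 1 - q]])

# pi P = pi  <=>  P^T pi^T = pi^T, so take the eigenvector of P^T
# whose eigenvalue is closest to 1.
eigvals, eigvecs = np.linalg.eig(P.T)
i = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, i])
pi = pi / pi.sum()  # normalize so the entries sum to 1

print(pi)  # ~ [0.8333 0.1667]
# Closed form for the 2-state chain:
print(q / (p + q), p / (p + q))
```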

Irreducibility and Periodicity

A Markov chain is irreducible if it is possible to get from any state to any other state (possibly in more than one step). A state is periodic if returns to that state can only occur at multiples of some integer greater than 1. An irreducible, positive recurrent, and aperiodic Markov chain is called ergodic, which guarantees convergence to a unique stationary distribution.
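For a finite chain with k states, irreducibility can be checked mechanically: every state can reach every other state exactly when the matrix (I + P)^(k-1) has no zero entries. A small sketch with two illustrative chains (the matrices are assumptions for demonstration):

```python
import numpy as np

def is_irreducible(P):
    """A finite k-state chain is irreducible iff (I + P)^(k-1)
    has strictly positive entries everywhere."""
    k = P.shape[0]
    M = np.linalg.matrix_power(np.eye(k) + P, k - 1)
    return bool(np.all(M > 0))

# Irreducible: both states communicate.
P1 = np.array([[0.9, 0.1],
               [0.5, 0.5]])
# Reducible: state 1 is absorbing, so state 0 is unreachable from it.
P2 = np.array([[0.5, 0.5],
               [0.0, 1.0]])

print(is_irreducible(P1))  # -> True
print(is_irreducible(P2))  # -> False
```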

Historical Context

The theory of Markov chains is named after the Russian mathematician Andrey Markov, who introduced them in a 1906 paper. Markov originally developed the theory to prove that the Law of Large Numbers could apply to dependent random variables, extending previous work that required independence.

In 1913, Markov famously applied his chains to study the alternation of vowels and consonants in Alexander Pushkin's verse novel Eugene Onegin. This is considered one of the first practical applications of the theory, anticipating modern natural language processing techniques by nearly a century.

Real-world Applications

  • Google's PageRank Algorithm: The original PageRank algorithm models web surfing as a Markov chain. The PageRank of a webpage is proportional to its probability in the steady-state distribution of this chain.
  • Hidden Markov Models (HMMs): Used extensively in speech recognition, bioinformatics (like gene prediction), and time-series analysis where the underlying states are unobserved.
  • Operations Research: Modeling queuing systems, inventory management, and machine maintenance (predicting when a machine will break down based on its current wear state).
  • Finance and Economics: Modeling stock market trends (e.g., transitions between bull, bear, and stagnant markets) and credit rating migrations.
  • Generative AI: Markov chain concepts are foundational to Markov Chain Monte Carlo (MCMC) methods used in Bayesian inference, and inspired early text generation models before modern neural networks.
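To make the PageRank application concrete, here is a toy power-iteration sketch on a hypothetical three-page web (the link graph and damping factor d = 0.85 are illustrative assumptions, not Google's actual data):

```python
import numpy as np

# Hypothetical 3-page link graph; row i gives where a surfer on page i goes next.
links = np.array([[0.0, 0.5, 0.5],   # page 0 links to pages 1 and 2
                  [1.0, 0.0, 0.0],   # page 1 links to page 0
                  [1.0, 0.0, 0.0]])  # page 2 links to page 0

d = 0.85                      # damping: follow a link with prob d, else jump anywhere
n = links.shape[0]
G = d * links + (1 - d) / n   # the resulting matrix is still row-stochastic

# Power iteration: repeatedly apply v <- v G until it settles.
v = np.full(n, 1.0 / n)
for _ in range(100):
    v = v @ G

print(v)  # steady-state distribution = PageRank-style scores; page 0 ranks highest
```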

Related Concepts

  • Eigenvalues & Eigenvectors — finding the steady-state vector involves computing the eigenvector corresponding to an eigenvalue of 1.
  • Matrix Multiplication — the state distribution is updated by multiplying the state vector by the transition matrix.
  • Monte Carlo Simulation — generating random samples based on transition probabilities to simulate complex systems over time.
