Rejection Sampling

Generate samples from a complex target distribution by using a scaled, simpler proposal distribution.

Concept Overview

Rejection Sampling is a Monte Carlo method used to generate independent samples from a target probability distribution that is difficult or impossible to sample from directly. The method involves sampling from a simpler "proposal" distribution, and then either accepting or rejecting those samples based on a scaled acceptance criterion.

Mathematical Definition

Let p(x) be the target probability density function we want to sample from, and q(x) be the proposal density function that is easy to sample from. We define a constant M such that for all x in the support of p(x):

p(x) ≤ M · q(x)

The algorithm proceeds as follows:

Sample a candidate value x from the proposal distribution q(x).
Sample a uniform random variable u from U(0, M · q(x)).
If u ≤ p(x), accept x as a sample from p(x).
Otherwise, reject x and repeat the process.

Key Concepts

The Envelope M · q(x)

The product M · q(x) acts as an "envelope" that strictly upper bounds the target distribution p(x). If M · q(x) is ever less than p(x), the sampling process will be fundamentally flawed, as it will never generate enough samples in those specific regions, leading to a biased, incorrect output.

Acceptance Rate

The efficiency of Rejection Sampling heavily relies on the choice of the proposal distribution q(x) and the constant M. The theoretical acceptance rate is exactly 1/M. Therefore, to minimize wasted computational effort (rejected samples), M should be chosen as close to 1 as possible, meaning q(x) should closely mimic the shape of p(x).

Curse of Dimensionality

While Rejection Sampling works excellently for 1-dimensional and 2-dimensional distributions, it struggles in higher dimensions. As dimensionality increases, the volume of the proposal envelope M · q(x) grows exponentially larger than the volume of the target distribution p(x). This makes the acceptance rate plummet to practically zero, rendering the algorithm computationally useless for high-dimensional problems.

Historical Context

Rejection sampling was formulated by mathematician John von Neumann in 1951. During his work on the Manhattan Project at Los Alamos and subsequent research utilizing the ENIAC computer, he helped pioneer the broader field of Monte Carlo methods. He introduced the rejection method as an exact, elegant way to generate pseudo-random numbers following complex analytical curves which could not be modeled using standard inverse transform sampling.

Real-world Applications

Computer Graphics: Used in rendering algorithms, such as path tracing, to generate light rays and sampling BRDFs (Bidirectional Reflectance Distribution Functions).
Bayesian Statistics: Applied to approximate complex posterior distributions when analytical methods or standard direct sampling are intractable.
Simulation: Generating events with arbitrary, complex arrival time distributions in discrete-event simulation engines.
Machine Learning: Creating synthetic datasets based on non-standard target demographic profiles or physics-based constraints.

Related Concepts

Markov Chain Monte Carlo — an alternative sampling method that scales much better to high dimensions.
Monte Carlo Simulation — the general family of algorithms utilizing repeated random sampling.
Probability Distributions — the theoretical foundation defining the target and proposal functions.

Rejection Sampling