Computer Science

B-Tree Visualization

Visualize B-Tree node splitting, key insertion, and tree balancing with customizable degree parameters.

B-Tree

Concept Overview

A B-Tree is a self-balancing search tree in which nodes can have more than two children. Unlike a standard binary search tree, a B-Tree is optimized for systems that read and write large blocks of data. It maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. B-Trees are particularly well-suited for storage systems, like databases and file systems, where reading a block of data from disk is a slow operation.

Mathematical Definition

A B-Tree of order (or minimum degree) t (where t ≥ 2) satisfies the following properties:

Every node has at most 2t - 1 keys.
Every node (except root) has at least t - 1 keys.
Every internal node with k keys has k + 1 children.
All leaves appear on the same level.

Height h of a B-Tree with n keys is bounded by:
h ≤ logt((n + 1) / 2)

Key Concepts

  • Node Splitting: When a node becomes full (contains 2t - 1 keys), it is split into two nodes, each containing t - 1 keys. The median key is promoted to the parent node. If the root is split, a new root is created, increasing the tree's height.
  • Insertion: Keys are always inserted into leaf nodes. If the target leaf is full, it must be split before insertion. To do this in a single pass from root to leaf, any full node encountered during traversal is split.
  • Degree (t): The minimum degree dictates the branching factor and occupancy constraints. A larger degree means nodes contain more keys and children, reducing the overall height of the tree.

Historical Context

The B-Tree was invented by Rudolf Bayer and Edward M. McCreight in 1971 while they were working at Boeing Research Labs. The origin of the letter "B" in B-Tree remains a subject of speculation; Bayer and McCreight never definitively explained it. It has been theorized to stand for "Boeing", "Bayer", "Balanced", or "Broad".

Real-world Applications

  • Relational Databases: SQL databases use B-Trees (or B+ Trees) to build indices, allowing fast lookups, insertions, and range queries on massive datasets stored on disk.
  • File Systems: Modern file systems such as NTFS, ext4, and APFS utilize variations of B-Trees to map file names to disk blocks efficiently.
  • Key-Value Stores: Some NoSQL storage engines use B-Trees as their primary underlying data structure for maintaining ordered keys.

Related Concepts

  • Red-Black Tree — a binary self-balancing tree that is isomorphic to a B-Tree of order 4 (2-3-4 tree).
  • B+ Tree — a variant of the B-Tree where all values are stored in the leaf level, and leaves are linked sequentially, optimizing range queries.
  • AVL Tree — another type of self-balancing binary search tree, which maintains strict height balance.

Experience it interactively

Adjust parameters, observe in real time, and build deep intuition with Riano’s interactive B-Tree Visualization module.

Try B-Tree Visualization on Riano →

More in Computer Science