
Lloyd-Max Scalar Quantization

The optimal scalar quantizer for Gaussian distributions. Iteratively refines centroids and decision boundaries to minimize mean squared error.

The Problem

Given a continuous random variable X with density f(x) and a budget of L levels (16 levels at 4 bits), find reconstruction values (centroids) c_1, ..., c_L and decision boundaries b_0 < b_1 < ... < b_L that minimize the mean squared error:

MSE = E[(X − Q(X))²] = Σᵢ ∫ from b_(i−1) to b_i of (x − cᵢ)² f(x) dx

where Q(X) maps X to the centroid cᵢ of the interval [b_(i−1), b_i) containing it.

The Algorithm

The Lloyd-Max algorithm (Lloyd 1982, Max 1960) iteratively refines centroids and boundaries until convergence (~200 iterations):

1. Initialize: place centroids uniformly across the distribution's range.

2. Update boundaries (nearest-neighbor rule): each boundary becomes the midpoint between adjacent centroids, so every point is assigned to its nearest centroid.

3. Update centroids (conditional expectation): each centroid becomes the conditional mean of its partition region under the density.

4. Repeat: alternate steps 2 and 3 until the MSE converges.
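The four steps above can be sketched with a sample-based variant of the update rules. This is a minimal illustration assuming NumPy; the `lloyd_max` function name and signature are hypothetical, not the project's actual `_compute_lloyd_max_gaussian()`:

```python
import numpy as np

def lloyd_max(samples, n_levels=16, n_iters=200, tol=1e-9):
    """Sample-based Lloyd-Max sketch (illustrative, not the project's code)."""
    # Step 1: initialize centroids uniformly across the observed range.
    centroids = np.linspace(samples.min(), samples.max(), n_levels)
    prev_mse = np.inf
    for _ in range(n_iters):
        # Step 2: nearest-neighbor rule -- boundaries are the midpoints
        # between adjacent centroids.
        boundaries = 0.5 * (centroids[:-1] + centroids[1:])
        idx = np.searchsorted(boundaries, samples)
        # Step 3: conditional expectation -- each centroid becomes the
        # mean of the samples falling in its partition region.
        for k in range(n_levels):
            members = samples[idx == k]
            if members.size:
                centroids[k] = members.mean()
        # Step 4: repeat until the MSE stops improving.
        mse = np.mean((samples - centroids[idx]) ** 2)
        if prev_mse - mse < tol:
            break
        prev_mse = mse
    return centroids, boundaries

rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000)
cb, bd = lloyd_max(samples)
```

On standard-normal samples the returned centroids come out symmetric about zero and denser near the mean, where the density is highest.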

Interactive Visualization

The Gaussian curve below shows the 16 Lloyd-Max centroids (vertical lines) and decision boundaries for 𝒩(0,1). Each centroid is the conditional mean of its partition region, so the quantizer places more levels where the density is highest.

Gaussian ๐’ฉ(0,1) Centroids Boundaries

The 16-Entry Codebook

At 4 bits, the codebook is just 16 float32 values (64 bytes), shared across all layers. This is optimal: no other scalar quantizer with the same number of levels achieves lower MSE for the Gaussian distribution.

index  value      index  value
0      -2.40      8      +0.26
1      -1.84      9      +0.52
2      -1.44      10     +0.80
3      -1.10      11     +1.10
4      -0.80      12     +1.44
5      -0.52      13     +1.84
6      -0.26      14     +2.40
7       0.00      15     +2.40
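With the codebook in hand, quantization is a nearest-centroid lookup and dequantization a table read. A minimal sketch, assuming NumPy; the values are transcribed from the table above, and the `encode`/`decode` helper names are hypothetical:

```python
import numpy as np

# 16-entry codebook as printed above (float32: 16 x 4 bytes = 64 bytes).
CODEBOOK = np.array([-2.40, -1.84, -1.44, -1.10, -0.80, -0.52, -0.26, 0.00,
                     +0.26, +0.52, +0.80, +1.10, +1.44, +1.84, +2.40, +2.40],
                    dtype=np.float32)

def encode(x):
    """Map each value to the index of its nearest centroid (a 4-bit code)."""
    # Decision boundaries sit at midpoints between adjacent centroids, so
    # nearest-centroid search reduces to one binary search per value.
    boundaries = 0.5 * (CODEBOOK[:-1] + CODEBOOK[1:])
    return np.searchsorted(boundaries, x).astype(np.uint8)

def decode(indices):
    """Reconstruct by direct table lookup."""
    return CODEBOOK[indices]

codes = encode(np.array([0.0, -3.0, 1.0]))
```

Out-of-range values saturate to the outermost centroids, so every input maps to a valid 4-bit index.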

Why It Works for TurboQuant

After rotation, each weight coordinate is approximately 𝒩(0,1). The Lloyd-Max codebook for this distribution is computed once and reused everywhere, achieving overall distortion within 2.7× of the information-theoretic lower bound (Shannon rate-distortion).
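The gap to the Shannon bound, D(R) = σ²·2^(−2R) for a unit-variance Gaussian, can be checked empirically under the assumption of i.i.d. 𝒩(0,1) coordinates. A rough sketch assuming NumPy, with codebook values transcribed from the table above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500_000)

# Codebook values as printed in the table above.
cb = np.array([-2.40, -1.84, -1.44, -1.10, -0.80, -0.52, -0.26, 0.00,
               0.26, 0.52, 0.80, 1.10, 1.44, 1.84, 2.40, 2.40])
edges = 0.5 * (cb[:-1] + cb[1:])
xq = cb[np.searchsorted(edges, x)]

mse = np.mean((x - xq) ** 2)       # empirical distortion at R = 4 bits
shannon = 2.0 ** (-2 * 4)          # Shannon bound for unit variance: 2^(-2R)
ratio = mse / shannon
print(f"MSE = {mse:.4f}, bound = {shannon:.4f}, ratio = {ratio:.2f}x")
```

The measured ratio is for scalar quantization alone; the 2.7× figure quoted above is the source's claim for the overall scheme.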

Implementation

codebook.py โ†’ _compute_lloyd_max_gaussian()