
Lloyd-Max Scalar Quantization

The optimal scalar quantizer for Gaussian distributions. Iteratively refines centroids and decision boundaries to minimize mean squared error.

The Problem

Given a continuous random variable X with density f(x) and a budget of L levels (16 levels at 4 bits), find reconstruction values (centroids) c_1, ..., c_L and decision boundaries b_0 < b_1 < ... < b_L that minimize the mean squared error:

MSE = E[(X − Q(X))²] = Σᵢ ∫ from b_(i−1) to b_i of (x − cᵢ)² f(x) dx

where Q(X) maps X to the centroid cᵢ of the interval [b_(i−1), b_i) containing it.

The Algorithm

The Lloyd-Max algorithm (Lloyd 1982, Max 1960) iteratively refines centroids and boundaries until convergence (~200 iterations):

1. Initialize: place centroids uniformly across the distribution's range.

2. Update boundaries (nearest-neighbor rule): each boundary becomes the midpoint between adjacent centroids, so every point is assigned to its nearest centroid.

3. Update centroids (conditional expectation): each centroid becomes the conditional mean of its partition region under the density.

4. Repeat: alternate steps 2 and 3 until the MSE converges.
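The four steps above can be sketched with a sample-based variant of the update rules. This is a minimal illustration assuming NumPy; the `lloyd_max` function name and signature are hypothetical, not the project's actual `_compute_lloyd_max_gaussian()`:

```python
import numpy as np

def lloyd_max(samples, n_levels=16, n_iters=200, tol=1e-9):
    """Sample-based Lloyd-Max sketch (illustrative, not the project's code)."""
    # Step 1: initialize centroids uniformly across the observed range.
    centroids = np.linspace(samples.min(), samples.max(), n_levels)
    prev_mse = np.inf
    for _ in range(n_iters):
        # Step 2: nearest-neighbor rule -- boundaries are the midpoints
        # between adjacent centroids.
        boundaries = 0.5 * (centroids[:-1] + centroids[1:])
        idx = np.searchsorted(boundaries, samples)
        # Step 3: conditional expectation -- each centroid becomes the
        # mean of the samples falling in its partition region.
        for k in range(n_levels):
            members = samples[idx == k]
            if members.size:
                centroids[k] = members.mean()
        # Step 4: repeat until the MSE stops improving.
        mse = np.mean((samples - centroids[idx]) ** 2)
        if prev_mse - mse < tol:
            break
        prev_mse = mse
    return centroids, boundaries

rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000)
cb, bd = lloyd_max(samples)
```

On standard-normal samples the returned centroids come out symmetric about zero and denser near the mean, where the density is highest.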

Interactive Visualization

The Gaussian curve below shows the 16 Lloyd-Max centroids (vertical lines) and decision boundaries for 𝒩(0,1). Each centroid is the conditional mean of its partition region, so the quantizer places more levels where the density is highest.

Gaussian ๐’ฉ(0,1) Centroids Boundaries

The 16-Entry Codebook

At 4 bits, the codebook is just 16 float32 values (64 bytes), shared across all layers. This is optimal: no other scalar quantizer with the same number of levels achieves lower MSE for the Gaussian distribution.

index  value      index  value
0      -2.40      8      +0.26
1      -1.84      9      +0.52
2      -1.44      10     +0.80
3      -1.10      11     +1.10
4      -0.80      12     +1.44
5      -0.52      13     +1.84
6      -0.26      14     +2.40
7       0.00      15     +2.40
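With the codebook in hand, quantization is a nearest-centroid lookup and dequantization a table read. A minimal sketch, assuming NumPy; the values are transcribed from the table above, and the `encode`/`decode` helper names are hypothetical:

```python
import numpy as np

# 16-entry codebook as printed above (float32: 16 x 4 bytes = 64 bytes).
CODEBOOK = np.array([-2.40, -1.84, -1.44, -1.10, -0.80, -0.52, -0.26, 0.00,
                     +0.26, +0.52, +0.80, +1.10, +1.44, +1.84, +2.40, +2.40],
                    dtype=np.float32)

def encode(x):
    """Map each value to the index of its nearest centroid (a 4-bit code)."""
    # Decision boundaries sit at midpoints between adjacent centroids, so
    # nearest-centroid search reduces to one binary search per value.
    boundaries = 0.5 * (CODEBOOK[:-1] + CODEBOOK[1:])
    return np.searchsorted(boundaries, x).astype(np.uint8)

def decode(indices):
    """Reconstruct by direct table lookup."""
    return CODEBOOK[indices]

codes = encode(np.array([0.0, -3.0, 1.0]))
```

Out-of-range values saturate to the outermost centroids, so every input maps to a valid 4-bit index.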

Why It Works for TurboQuant

After rotation, each weight coordinate is approximately 𝒩(0,1). The Lloyd-Max codebook for this distribution is computed once and reused everywhere, achieving overall distortion within 2.7× of the information-theoretic lower bound (Shannon rate-distortion).
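The gap to the Shannon bound, D(R) = σ²·2^(−2R) for a unit-variance Gaussian, can be checked empirically under the assumption of i.i.d. 𝒩(0,1) coordinates. A rough sketch assuming NumPy, with codebook values transcribed from the table above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500_000)

# Codebook values as printed in the table above.
cb = np.array([-2.40, -1.84, -1.44, -1.10, -0.80, -0.52, -0.26, 0.00,
               0.26, 0.52, 0.80, 1.10, 1.44, 1.84, 2.40, 2.40])
edges = 0.5 * (cb[:-1] + cb[1:])
xq = cb[np.searchsorted(edges, x)]

mse = np.mean((x - xq) ** 2)       # empirical distortion at R = 4 bits
shannon = 2.0 ** (-2 * 4)          # Shannon bound for unit variance: 2^(-2R)
ratio = mse / shannon
print(f"MSE = {mse:.4f}, bound = {shannon:.4f}, ratio = {ratio:.2f}x")
```

The measured ratio is for scalar quantization alone; the 2.7× figure quoted above is the source's claim for the overall scheme.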

Implementation

codebook.py โ†’ _compute_lloyd_max_gaussian()