TurboQuant vs Green Vectors
TurboQuant is a recent, near-optimal vector quantization algorithm that compresses each vector by randomly rotating it and applying optimal per-coordinate quantizers. Green Vectors takes a categorically different approach: instead of compressing each vector, it eliminates redundant vectors entirely. TurboQuant reduces the size of every vector, including redundant ones; Green Vectors reduces how many vectors exist. For an index bloated with near-duplicate vectors, removing them addresses the redundancy itself, which even near-optimal quantization can only shrink, not eliminate.
What TurboQuant is
TurboQuant, introduced in 2025 by Amir Zandieh and colleagues at Google Research, Google DeepMind, and NYU, is an online vector quantization algorithm designed for large language model KV-cache compression, nearest neighbor search, and vector databases. Its central idea is elegant: randomly rotate each vector before quantizing. The rotation spreads information evenly across coordinates, so that after rotation each coordinate follows a known distribution and the coordinates are nearly independent in high dimensions. This lets an optimal scalar quantizer be applied to each coordinate independently. It comes in two forms, one optimized for mean squared error and one for unbiased inner-product estimation, and it is data-oblivious, requiring no per-dataset training or calibration.
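The rotate-then-quantize idea described above can be sketched in a few lines. This is a minimal illustration, not TurboQuant's implementation: the helper names (random_rotation, quantize, dequantize) are hypothetical, the rotation is built with plain Gram-Schmidt for readability, and a uniform scalar quantizer stands in for the optimal per-coordinate codebooks, which are not reproduced here.

```python
import math
import random


def random_rotation(d, rng):
    """Random orthogonal d x d matrix: Gram-Schmidt on Gaussian rows."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for r in rows:  # remove components along rows already chosen
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows


def rotate(R, x):
    """Compute R @ x."""
    return [sum(r * a for r, a in zip(row, x)) for row in R]


def rotate_back(R, y):
    """Compute R^T @ y; R is orthogonal, so its transpose is its inverse."""
    d = len(y)
    return [sum(R[i][j] * y[i] for i in range(d)) for j in range(d)]


def quantize(y, bits, scale):
    """Uniform scalar quantizer on [-scale, scale] (stand-in codebook)."""
    levels = 2 ** bits
    step = 2 * scale / levels
    return [min(max(math.floor((v + scale) / step), 0), levels - 1) for v in y]


def dequantize(codes, bits, scale):
    """Map each code back to the midpoint of its quantization cell."""
    step = 2 * scale / 2 ** bits
    return [-scale + (c + 0.5) * step for c in codes]


rng = random.Random(0)
d = 32
R = random_rotation(d, rng)
x = [1.0] + [0.0] * (d - 1)        # unit vector with all energy in one coordinate
y = rotate(R, x)                   # energy is now spread across coordinates
scale = 3.0 / math.sqrt(d)         # assumed range: about 3 std devs of a rotated unit vector
codes = quantize(y, bits=4, scale=scale)
x_hat = rotate_back(R, dequantize(codes, bits=4, scale=scale))
err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
print(f"reconstruction error (L2): {err:.3f}")
```

The same fixed grid applied to x directly would clip its single large coordinate, since 1.0 lies well outside the assumed range. After rotation every coordinate has a similar, predictable scale, which is why one quantizer can serve all coordinates without per-dataset calibration.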
Why TurboQuant is a strong quantizer
TurboQuant is, by design, close to the best any quantizer can do. It approaches the information-theoretic lower bound on distortion for a given bit budget, provably within a small constant factor. That is a genuine achievement: it means TurboQuant extracts nearly all of the compression available from an individual vector at a given precision, without the per-dataset training that methods like product quantization require. As a quantizer, it is excellent.
The ceiling TurboQuant reveals
That same near-optimality exposes the limit of quantization itself. Because TurboQuant is already close to the theoretical floor, it shows that quantization as a category is nearly out of room: you cannot compress a vector below the information it carries without losing accuracy, and TurboQuant is near that boundary. Crucially, that boundary governs only the compression of the vectors you have. It says nothing about how many vectors you have. If an index contains many near-duplicate, redundant vectors, TurboQuant faithfully compresses every one of them. The redundancy is preserved, simply in smaller form, and no further quantization advance can remove it, because removing it is not what quantization does.
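A quick back-of-the-envelope calculation makes the point concrete. The figures below (one million 768-dimensional vectors, 4 bits per coordinate, 60% of vectors redundant) are hypothetical and chosen only for illustration, and index_bytes is an illustrative helper that ignores metadata and index-structure overhead.

```python
def index_bytes(n_vectors, dim, bits_per_coord):
    """Raw vector payload in bytes, ignoring metadata and index overhead."""
    return n_vectors * dim * bits_per_coord // 8


n_vectors, dim = 1_000_000, 768    # hypothetical index
redundant_pct = 60                 # hypothetical share of near-duplicate vectors

full = index_bytes(n_vectors, dim, 32)       # float32 coordinates
quantized = index_bytes(n_vectors, dim, 4)   # 4 bits per coordinate

# Quantization changes bits per coordinate, never the vector count,
# so the redundant share of the index is the same before and after.
spent_on_duplicates = quantized * redundant_pct // 100

print(f"full precision:      {full:>13,} bytes")
print(f"4-bit quantized:     {quantized:>13,} bytes")
print(f"still on duplicates: {spent_on_duplicates:>13,} bytes")
```

Under these assumptions the index shrinks eightfold, yet the same number of vectors is stored and searched, and the same fraction of the remaining bytes still encodes redundant content.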
What Green Vectors does differently
Green Vectors operates on a different axis entirely. Rather than compressing each vector, it eliminates semantic redundancy at ingestion through a patent-pending transformation, collapsing redundant vectors into single representations while keeping each remaining vector at full precision. It is not bounded by the quantization floor because it is not quantizing. It removes the redundant vectors that a quantizer would otherwise spend bits compressing. The outcome is lower storage with preserved or improved accuracy, rather than lower storage at the cost of accuracy. In benchmarked workloads, Green Vectors reduced vector count by up to 99.5% while improving search quality by up to 59%.
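To make the idea of reducing vector count concrete, here is a generic sketch of collapsing near-duplicates at ingestion. It is not Green Vectors' patent-pending transformation, whose details are not described here. It is a textbook greedy pass that keeps a vector only if no already-kept vector exceeds an assumed cosine-similarity threshold; the function names and the 0.99 threshold are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def collapse_near_duplicates(vectors, threshold=0.99):
    """Greedy near-duplicate collapse (illustrative only).

    Returns (kept, assignment): kept lists the indices of vectors retained
    at full precision; assignment[i] is the position in kept of the
    representative that stands in for vectors[i].
    """
    kept, assignment = [], []
    for i, v in enumerate(vectors):
        for pos, k in enumerate(kept):
            if cosine(v, vectors[k]) >= threshold:
                assignment.append(pos)   # redundant: reuse the representative
                break
        else:
            kept.append(i)               # novel: keep as a new representative
            assignment.append(len(kept) - 1)
    return kept, assignment


vectors = [
    [1.0, 0.0],
    [0.999, 0.01],   # near-duplicate of vector 0
    [0.0, 1.0],
    [0.01, 0.999],   # near-duplicate of vector 2
    [1.0, 0.001],    # near-duplicate of vector 0
]
kept, assignment = collapse_near_duplicates(vectors)
print(kept)        # [0, 2]
print(assignment)  # [0, 0, 1, 1, 0]
```

Five vectors collapse to two, and the two survivors are stored unchanged. This pass is quadratic in the worst case and sensitive to ingestion order, so it only illustrates the axis of reduction, not a production method.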