Is Green Vectors better than TurboQuant?

They solve different problems. TurboQuant is near-optimal at compressing each vector; Green Vectors removes redundant vectors entirely. For indexes with redundancy, reduction lowers storage while preserving accuracy, which compression cannot do. They work on different axes and can be layered, but reduction makes quantization optional for most workloads.

Does Green Vectors use quantization?

No. Green Vectors is a semantic transformation that reduces the number of vectors, not their precision. Each remaining vector stays at full precision.

Can I use TurboQuant and Green Vectors together?

Conceptually, yes, since they operate on different axes. But once Green Vectors removes redundancy, a quantizer typically has little left to improve, so it becomes optional.

What is TurboQuant best at?

Compressing individual vectors close to the theoretical limit without per-dataset training. It is strong for KV-cache compression and memory-constrained search. It does not reduce the number of vectors stored.

Why doesn't near-optimal quantization solve index bloat?

Because the bloat is often caused by redundant vectors, and quantization compresses redundancy rather than removing it. Even a near-optimal quantizer faithfully stores every redundant vector in smaller form.

Technical Comparison

TurboQuant vs Green Vectors

TurboQuant is a recent, near-optimal vector quantization algorithm that compresses each vector by randomly rotating it and applying optimal per-coordinate quantizers. Green Vectors takes a categorically different approach: instead of compressing each vector, it eliminates redundant vectors entirely. TurboQuant reduces the size of every vector, including redundant ones; Green Vectors reduces how many vectors exist. For an index bloated with near-duplicate vectors, removing the redundancy addresses the cause of the bloat, whereas even a near-optimal quantizer can only shrink each redundant vector, not eliminate it.

What TurboQuant is

TurboQuant, introduced in 2025 by Amir Zandieh and colleagues at Google Research, Google DeepMind, and NYU, is an online vector quantization algorithm designed for large language model KV-cache compression, nearest neighbor search, and vector databases. Its central idea is elegant: randomly rotate each vector before quantizing. The rotation spreads information evenly across coordinates, so that after rotation each coordinate follows a known distribution and the coordinates are nearly independent in high dimensions. This lets an optimal scalar quantizer be applied to each coordinate independently. It comes in two forms, one optimized for mean squared error and one for unbiased inner-product estimation, and it is data-oblivious, requiring no per-dataset training or calibration.
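The rotate-then-quantize idea can be sketched in a few lines. This is a minimal illustration, not TurboQuant itself: it uses a plain uniform scalar quantizer where TurboQuant uses quantizers matched to the known post-rotation distribution, and the function names, the `limit` clipping range, and the seed are all assumptions made for the example. It uses only the standard library so it runs as-is; a practical version would use a numerical library.

```python
import math
import random


def random_rotation(dim, rng):
    """Return a random orthonormal basis (list of rows) via Gram-Schmidt."""
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:  # remove components along rows already chosen
            dot = sum(x * y for x, y in zip(v, r))
            v = [x - dot * y for x, y in zip(v, r)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # skip the (unlikely) degenerate draw
            rows.append([x / norm for x in v])
    return rows


def rotate(rows, vec):
    """Apply the transform: one dot product per output coordinate."""
    return [sum(m * x for m, x in zip(row, vec)) for row in rows]


def rotate_back(rows, vec):
    """Undo the transform; the inverse of an orthogonal matrix is its transpose."""
    dim = len(vec)
    return [sum(rows[i][j] * vec[i] for i in range(dim)) for j in range(dim)]


def quantize(vec, bits, limit):
    """Map each coordinate to one of 2**bits integer codes on [-limit, limit]."""
    levels = 2 ** bits
    step = 2.0 * limit / levels
    return [min(max(math.floor((x + limit) / step), 0), levels - 1) for x in vec]


def dequantize(codes, bits, limit):
    """Reconstruct each coordinate as the midpoint of its quantization cell."""
    step = 2.0 * limit / 2 ** bits
    return [-limit + (c + 0.5) * step for c in codes]


def roundtrip(vec, bits, limit, seed=0):
    """Rotate, quantize, dequantize, rotate back. Returns (codes, reconstruction)."""
    rows = random_rotation(len(vec), random.Random(seed))
    codes = quantize(rotate(rows, vec), bits, limit)
    return codes, rotate_back(rows, dequantize(codes, bits, limit))


# A unit vector with all its mass on one coordinate; the rotation spreads it out.
vec = [1.0] + [0.0] * 15
codes, approx = roundtrip(vec, bits=4, limit=1.0)
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, approx)))
```

Because the transform is orthogonal, it preserves vector length, so the reconstruction error after rotating back equals the quantization error in the rotated space. The same rotation can be regenerated from the seed, which is one way to read "data-oblivious": nothing in the sketch is fitted to a dataset.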

Why TurboQuant is a strong quantizer

TurboQuant is, by design, close to the best any quantizer can do. It approaches the information-theoretic lower bound on distortion for a given bit budget, provably within a small constant factor. That is a genuine achievement: it means TurboQuant extracts nearly all of the compression available from an individual vector at a given precision, without the per-dataset training that methods like product quantization require. As a quantizer, it is excellent.

The ceiling TurboQuant reveals

That same near-optimality exposes the limit of quantization itself. Because TurboQuant is already close to the theoretical floor, it shows that quantization as a category is nearly out of room: you cannot compress a vector below the information it carries without losing accuracy, and TurboQuant is near that boundary. Crucially, that boundary governs only the compression of the vectors you have. It says nothing about how many vectors you have. If an index contains many near-duplicate, redundant vectors, TurboQuant faithfully compresses every one of them. The redundancy is preserved, simply in smaller form, and no further quantization advance can remove it, because removing it is not what quantization does.
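The two axes are easy to see in back-of-the-envelope arithmetic. The sketch below assumes raw payload size is simply vectors × dimensions × bits per coordinate, ignoring metadata and index overhead; the vector counts, the dimension of 768, and the bit widths are hypothetical round numbers chosen for illustration, not benchmark results.

```python
def index_bytes(num_vectors, dim, bits_per_coord):
    """Raw payload size in bytes, ignoring metadata and index overhead."""
    return num_vectors * dim * bits_per_coord // 8


full = index_bytes(1_000_000, 768, 32)      # float32 baseline
quantized = index_bytes(1_000_000, 768, 4)  # fewer bits, same vector count
reduced = index_bytes(100_000, 768, 32)     # fewer vectors, same precision
```

Quantization only moves `bits_per_coord`; the `num_vectors` term is untouched, so any redundant vectors are still stored and still searched.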

What Green Vectors does differently

Green Vectors operates on a different axis entirely. Rather than compressing each vector, it eliminates semantic redundancy at ingestion through a patent-pending transformation, collapsing redundant vectors into single representations while keeping each remaining vector at full precision. It is not bounded by the quantization floor because it is not quantizing. It removes the redundant vectors that a quantizer would otherwise spend bits compressing. The outcome is lower storage with preserved or improved accuracy, rather than lower storage at the cost of accuracy. In benchmarked workloads, Green Vectors reduced vector count by up to 99.5% while improving search quality by up to 59%.
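The Green Vectors transformation itself is not described here, so the following is not its method. As a stand-in for the general idea of lowering vector count rather than precision, this sketch shows a naive greedy near-duplicate filter based on cosine similarity. The function names and the 0.98 threshold are assumptions for illustration, and the code assumes no zero-length vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def drop_near_duplicates(vectors, threshold=0.98):
    """Keep a vector only if it is not too similar to any vector already kept."""
    kept = []
    for v in vectors:
        if all(cosine(v, k) < threshold for k in kept):
            kept.append(v)  # stored unchanged, at full precision
    return kept


vectors = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
kept = drop_near_duplicates(vectors)
```

In this toy input the second vector is nearly identical to the first and is dropped, while the surviving vectors are stored exactly as given. A filter this simple compares every vector against every kept vector, so it illustrates the axis being changed rather than a practical ingestion pipeline.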

How they compare

	TurboQuant	Green Vectors
Category	Vector quantization (compression)	Vector reduction
What it changes	Bits per vector	Number of vectors
Each vector	Smaller, lower precision	Full precision, unchanged
Redundant vectors	Compressed and retained	Removed
Accuracy effect	Near-optimal but nonzero distortion	Preserved or improved
Per-dataset training	Not required	Not required
Bounded by the quantization floor	Yes, near it by design	No, a different axis

Do you still need TurboQuant with Green Vectors?

For most workloads, no. Once redundancy is removed, there is far less left for any quantizer to compress, and reduction has already delivered the storage savings without the accuracy tradeoff. Quantization and reduction operate on different axes, so TurboQuant can be layered on if a pipeline already uses it. But with Green Vectors, quantization becomes optional rather than necessary. The most reliable way to confirm this for your workload is to evaluate first-pass results on your own data.
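One simple way to run such an evaluation is to compare top-k hit rate across pipelines on a set of labeled queries. The sketch below assumes you already have, for each query, a ranked list of result ids from each pipeline and a set of ids judged relevant; the metric choice, the function name, and the toy data are all illustrative assumptions, not a prescribed procedure.

```python
def hit_rate_at_k(results, relevant, k):
    """Fraction of queries whose top-k results contain at least one relevant id."""
    hits = 0
    for query, ranked_ids in results.items():
        if any(doc_id in relevant[query] for doc_id in ranked_ids[:k]):
            hits += 1
    return hits / len(results)  # assumes at least one query


# Toy inputs: relevance labels plus ranked results from two pipelines.
relevant = {"q1": {"a"}, "q2": {"d"}}
pipeline_a = {"q1": ["a", "b", "c"], "q2": ["b", "c", "d"]}
pipeline_b = {"q1": ["a", "c"], "q2": ["d", "c"]}

score_a = hit_rate_at_k(pipeline_a, relevant, k=2)
score_b = hit_rate_at_k(pipeline_b, relevant, k=2)
```

Scoring against relevance labels rather than against another index's result ids keeps the comparison fair when the pipelines store different numbers of vectors.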