Technical Comparison

    TurboQuant vs Green Vectors

    TurboQuant is a recent, near-optimal vector quantization algorithm that compresses each vector by randomly rotating it and applying optimal per-coordinate quantizers. Green Vectors takes a categorically different approach: instead of compressing each vector, it eliminates redundant vectors entirely. TurboQuant reduces the size of every vector, including redundant ones; Green Vectors reduces how many vectors exist. For an index bloated with near-duplicate vectors, removing the redundancy addresses the cause of the bloat, whereas even near-optimal quantization can only shrink the redundant vectors, not eliminate them.

    What TurboQuant is

    TurboQuant, introduced in 2025 by Amir Zandieh and colleagues at Google Research, Google DeepMind, and NYU, is an online vector quantization algorithm designed for large language model KV-cache compression, nearest neighbor search, and vector databases. Its central idea is elegant: randomly rotate each vector before quantizing. The rotation spreads information evenly across coordinates, so that after rotation each coordinate follows a known distribution and the coordinates are nearly independent in high dimensions. This lets an optimal scalar quantizer be applied to each coordinate independently. It comes in two forms, one optimized for mean squared error and one for unbiased inner-product estimation, and it is data-oblivious, requiring no per-dataset training or calibration.
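
    As a rough illustration of the rotate-then-quantize idea, here is a minimal Python sketch. It is not the TurboQuant implementation: it builds the rotation with Gram-Schmidt on Gaussian rows and swaps the optimal per-coordinate quantizers for a plain uniform quantizer on [-1, 1]. The function names, the 16-dimensional one-hot input, and the 4-bit budget are assumptions chosen for the example.

```python
import math
import random


def random_rotation(dim, seed=0):
    """Random orthogonal matrix: Gram-Schmidt applied to Gaussian rows."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:  # remove the components along rows already kept
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows


def rotate(matrix, vec):
    """Apply the rotation: y = R x."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def rotate_back(matrix, vec):
    """Undo the rotation: x = R^T y (the inverse of an orthogonal matrix)."""
    dim = len(vec)
    return [sum(matrix[i][j] * vec[i] for i in range(dim)) for j in range(dim)]


def quantize(vec, bits, limit):
    """Uniform scalar quantizer on [-limit, limit], one integer code per coordinate."""
    levels = 2 ** bits
    step = 2 * limit / levels
    return [min(levels - 1, max(0, math.floor((x + limit) / step))) for x in vec]


def dequantize(codes, bits, limit):
    """Map each code back to the centre of its quantization bin."""
    step = 2 * limit / 2 ** bits
    return [-limit + (c + 0.5) * step for c in codes]


# Hypothetical example: a unit vector with all its energy in one coordinate.
DIM, BITS, LIMIT = 16, 4, 1.0
rotation = random_rotation(DIM)
spiky = [1.0] + [0.0] * (DIM - 1)
rotated = rotate(rotation, spiky)      # energy is now spread over all coordinates
codes = quantize(rotated, BITS, LIMIT)  # 4 bits per coordinate
restored = rotate_back(rotation, dequantize(codes, BITS, LIMIT))
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(spiky, restored)))
```

    Because a rotation preserves length, the reconstruction error measured in the original coordinates equals the quantization error in the rotated ones. A quantizer matched to the known post-rotation coordinate distribution would do better than the uniform one used here.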

    Why TurboQuant is a strong quantizer

    TurboQuant is, by design, close to the best any quantizer can do. It approaches the information-theoretic lower bound on distortion for a given bit budget, provably within a small constant factor. That is a genuine achievement: it means TurboQuant extracts nearly all of the compression available from an individual vector at a given precision, without the per-dataset training that methods like product quantization require. As a quantizer, it is excellent.

    The ceiling TurboQuant reveals

    That same near-optimality exposes the limit of quantization itself. Because TurboQuant is already close to the theoretical floor, it shows that quantization as a category is nearly out of room: you cannot compress a vector below the information it carries without losing accuracy, and TurboQuant is near that boundary. Crucially, that boundary governs only the compression of the vectors you have. It says nothing about how many vectors you have. If an index contains many near-duplicate, redundant vectors, TurboQuant faithfully compresses every one of them. The redundancy is preserved, simply in smaller form, and no further quantization advance can remove it, because removing it is not what quantization does.
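
    A small Python sketch can make the distinction concrete. The quantizer below is a plain uniform scalar quantizer standing in for any quantizer, and the greedy cosine-threshold filter is a generic textbook deduplication shown only for contrast; it is an assumption for illustration and not the Green Vectors method. The toy vectors, the 4-bit budget, and the 0.99 threshold are likewise made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def quantize_all(vectors, bits=4, limit=1.0):
    """Uniformly quantize every coordinate; one code list per input vector."""
    levels = 2 ** bits
    step = 2 * limit / levels
    return [
        [min(levels - 1, max(0, math.floor((x + limit) / step))) for x in v]
        for v in vectors
    ]


def collapse_near_duplicates(vectors, threshold=0.99):
    """Greedy filter: keep a vector only if no kept vector is that similar."""
    kept = []
    for v in vectors:
        if all(cosine(v, k) < threshold for k in kept):
            kept.append(v)
    return kept


# Toy index: three near-copies of one direction, two near-copies of another.
vectors = [
    [1.0, 0.0, 0.0],
    [0.999, 0.01, 0.0],
    [0.998, 0.0, 0.02],
    [0.0, 1.0, 0.0],
    [0.01, 0.999, 0.0],
]
codes = quantize_all(vectors)             # same number of entries, fewer bits each
kept = collapse_near_duplicates(vectors)  # fewer entries, full precision each
```

    In this toy case the quantizer returns one code list per input vector, and the near-duplicates map to identical codes that are each still stored. The filter changes the count instead and leaves the surviving vectors untouched.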

    What Green Vectors does differently

    Green Vectors operates on a different axis entirely. Rather than compressing each vector, it eliminates semantic redundancy at ingestion through a patent-pending transformation, collapsing redundant vectors into single representations while keeping each remaining vector at full precision. It is not bounded by the quantization floor because it is not quantizing. It removes the redundant vectors that a quantizer would otherwise spend bits compressing. The outcome is lower storage with preserved or improved accuracy, rather than lower storage at the cost of accuracy. In benchmarked workloads, Green Vectors reduced vector count by up to 99.5% while improving search quality by up to 59%.

    How they compare

                                      | TurboQuant                          | Green Vectors
    Category                          | Vector quantization (compression)   | Vector reduction
    What it changes                   | Bits per vector                     | Number of vectors
    Each vector                       | Smaller, lower precision            | Full precision, unchanged
    Redundant vectors                 | Compressed and retained             | Removed
    Accuracy effect                   | Near-optimal but nonzero distortion | Preserved or improved
    Per-dataset training              | Not required                        | Not required
    Bounded by the quantization floor | Yes, near it by design              | No, a different axis
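
    The two axes can be put into back-of-the-envelope arithmetic: raw vector storage is roughly the number of vectors times the dimension times the bits per coordinate. The Python sketch below uses assumed figures (one million 768-dimensional float32 vectors and a 4-bit quantizer) and takes the 99.5% reduction from the best-case "up to" benchmark figure, so it is an upper bound rather than a typical result. It ignores index structures, metadata, and any per-vector side information a quantizer stores.

```python
def index_bytes(num_vectors, dim, bits_per_coord):
    """Raw vector payload in bytes: count x dimension x bits per coordinate / 8."""
    return num_vectors * dim * bits_per_coord // 8


# Assumed baseline: 1,000,000 float32 vectors of dimension 768.
baseline = index_bytes(1_000_000, 768, 32)

# Quantization axis: same count, 4 bits per coordinate instead of 32.
quantized = index_bytes(1_000_000, 768, 4)

# Reduction axis: 99.5% fewer vectors (best-case figure), still float32.
reduced = index_bytes(5_000, 768, 32)

quantization_ratio = baseline / quantized  # savings from fewer bits per vector
reduction_ratio = baseline / reduced       # savings from fewer vectors
```

    Under these assumptions the quantization ratio is fixed by the bit budget (32 / 4), while the reduction ratio depends entirely on how much redundancy the data contains. An index with little redundancy would see a correspondingly smaller reduction ratio.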

    Do you still need TurboQuant with Green Vectors?

    For most workloads, no. Once redundancy is removed, there is far less left for any quantizer to compress, and reduction has already delivered the storage savings without the accuracy tradeoff. Quantization and reduction operate on different axes, so TurboQuant can be layered on if a pipeline already uses it. But with Green Vectors, quantization becomes optional rather than necessary. The most reliable way to confirm this for your workload is to evaluate first-pass results on your own data.

    FAQ

    Frequently asked questions.

    Is TurboQuant better than Green Vectors?
    They solve different problems. TurboQuant is near-optimal at compressing each vector; Green Vectors removes redundant vectors entirely. For indexes with redundancy, reduction lowers storage while preserving accuracy, which compression cannot do. They work on different axes and can be layered, but reduction makes quantization optional for most workloads.

    Is Green Vectors a form of quantization?
    No. Green Vectors is a semantic transformation that reduces the number of vectors, not their precision. Each remaining vector stays at full precision.

    Can TurboQuant and Green Vectors be used together?
    Conceptually yes, since they operate on different axes. But once Green Vectors removes redundancy, a quantizer typically has little left to improve, so it becomes optional.

    What is TurboQuant good at?
    Compressing individual vectors close to the theoretical limit without per-dataset training. It is strong for KV-cache compression and memory-constrained search. It does not reduce the number of vectors stored.

    Why doesn't quantization fix a bloated index?
    Because the bloat is often caused by redundant vectors, and quantization compresses redundancy rather than removing it. Even a near-optimal quantizer faithfully stores every redundant vector in smaller form.

    Remove redundancy instead of compressing it

    Kitana is in closed beta. Benchmark Green Vectors against your current quantization stack on your own data.