Vector Compression vs Vector Reduction
Vector compression reduces the size of each individual vector by lowering its precision, for example through quantization, which saves storage but loses information and can reduce accuracy. Vector reduction lowers the number of vectors stored by eliminating semantic redundancy, so each remaining vector keeps full precision. In short, compression makes vectors smaller; reduction makes them fewer. The two are complementary and can be combined.
How they differ
Compression operates on each vector, shrinking it by lowering precision. Reduction operates on the set of vectors, removing those that are semantically redundant while keeping the rest at full precision. Compression trades accuracy for size; reduction removes redundancy without that tradeoff.
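To make the contrast concrete, here is a minimal sketch in Python with NumPy. It is illustrative only: symmetric int8 scalar quantization stands in for "compression", and a greedy cosine-similarity filter stands in for "reduction". The function names, the 0.95 threshold, and the choice of both techniques are assumptions made for this example, not a description of any particular product's method.

```python
import numpy as np

def quantize_int8(vectors):
    """Compression: keep every vector, but store each component in 8 bits."""
    # One shared scale for the whole array (an assumption; per-vector scales are also common).
    scale = float(np.abs(vectors).max()) / 127.0 or 1.0
    codes = np.clip(np.round(vectors / scale), -127, 127).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    """Approximate reconstruction; the rounding error is the information that was lost."""
    return codes.astype(np.float32) * scale

def reduce_redundant(vectors, threshold=0.95):
    """Reduction: drop vectors that are near-duplicates of one already kept.

    Greedy filter on cosine similarity. Survivors are returned unchanged,
    at their original precision.
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    kept = []
    for i in range(len(unit)):
        # Keep vector i only if it is not too similar to any vector kept so far.
        if not kept or np.max(unit[kept] @ unit[i]) < threshold:
            kept.append(i)
    return vectors[kept], kept

# Toy data: four distinct vectors plus a near-duplicate of each.
base = np.eye(4, 8, dtype=np.float32)
vectors = np.vstack([base, base + 0.01]).astype(np.float32)

codes, scale = quantize_int8(vectors)    # same count, fewer bytes per vector
reduced, kept = reduce_redundant(vectors)  # fewer vectors, same bytes per vector

print(vectors.shape, vectors.nbytes)  # original
print(codes.shape, codes.nbytes)      # compressed
print(reduced.shape, reduced.nbytes)  # reduced
```

In this toy case the compressed array has the same number of rows as the input but one byte per component instead of four, while the reduced array has fewer rows and leaves each surviving row untouched.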
Using both together
Compression and reduction operate on different axes, so they can be layered. In practice, once vectors are reduced, compression often adds little, and benchmarking showed reduction alone outperforming the combination. Compression remains available, but after reduction it is an optional step rather than one that reliably adds further benefit.
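As a rough sketch of how the two can be layered, the Python function below runs reduction first and treats compression as an optional second step. Everything here is hypothetical and chosen for illustration: the `build_index` name, the `compress` flag, the greedy cosine filter, and int8 scalar quantization. It shows only the structure of the pipeline and says nothing about retrieval accuracy.

```python
import numpy as np

def build_index(vectors, threshold=0.95, compress=False):
    """Reduce first, then optionally compress the survivors.

    Assumes `vectors` is a non-empty 2-D float array.
    """
    # Step 1 (reduction): greedily drop near-duplicates by cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    kept = []
    for i in range(len(unit)):
        if not kept or np.max(unit[kept] @ unit[i]) < threshold:
            kept.append(i)
    survivors = vectors[kept]

    if not compress:
        # Reduction only: survivors stay at full precision.
        return {"ids": kept, "data": survivors, "scale": None}

    # Step 2 (optional compression): int8 scalar quantization of the survivors.
    scale = float(np.abs(survivors).max()) / 127.0 or 1.0
    codes = np.clip(np.round(survivors / scale), -127, 127).astype(np.int8)
    return {"ids": kept, "data": codes, "scale": scale}

# Toy data: three distinct vectors plus a near-duplicate of each.
base = np.eye(3, 6, dtype=np.float32)
vectors = np.vstack([base, base + 0.01]).astype(np.float32)

reduced_only = build_index(vectors)
reduced_and_compressed = build_index(vectors, compress=True)

print(reduced_only["data"].dtype, reduced_only["data"].nbytes)
print(reduced_and_compressed["data"].dtype, reduced_and_compressed["data"].nbytes)
```

Keeping compression behind a flag mirrors the point above: reduction does the main work, and whether the extra quantization step is worth its precision loss is something to measure on your own data.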