    What Is Reranking in RAG?

    Reranking in RAG is a second-stage step that reorders the documents returned by initial retrieval to improve relevance. After a fast first-stage retrieval returns candidate documents, a reranker, typically a cross-encoder, scores each candidate against the query and reorders them so the most relevant appear first. Reranking improves the order of results but cannot recover documents that retrieval missed.

    How reranking works

    First-stage retrieval is optimized for speed and returns a broad candidate set. A reranker examines each query-document pair in detail, which is more accurate but more expensive, and assigns relevance scores used to reorder candidates. Because it runs on every query, reranking adds latency to each request, and its total cost grows with query volume and with the number of candidates scored per query.
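
    The two stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the corpus, the function names (first_stage, pair_score, rerank), and both scoring rules are hypothetical. In particular, pair_score is a toy stand-in for a cross-encoder; a real reranker would be a trained model scoring each query-document pair.

```python
# Toy corpus; a real system would hold chunks in a vector or keyword index.
DOCS = {
    "d1": "how much salt to use to reduce bitterness",
    "d2": "reranking latency grows with the number of candidates",
    "d3": "reduce reranking latency by scoring fewer candidates",
    "d4": "first-stage retrieval is optimized for speed",
}

def tokens(text):
    return text.lower().split()

def bigrams(words):
    return set(zip(words, words[1:]))

def first_stage(query, k):
    """Cheap candidate generation: count shared words, ignoring word order."""
    q = set(tokens(query))
    ranked = sorted(
        DOCS,
        key=lambda doc_id: (-len(q & set(tokens(DOCS[doc_id]))), doc_id),
    )
    return ranked[:k]

def pair_score(query, text):
    """Hypothetical stand-in for a cross-encoder: inspects the pair jointly
    and also rewards matching word order (shared bigrams)."""
    q, d = tokens(query), tokens(text)
    shared_words = len(set(q) & set(d))
    shared_bigrams = len(bigrams(q) & bigrams(d))
    return shared_words + 2 * shared_bigrams

def rerank(query, candidates):
    """Score every (query, candidate) pair and sort by descending score."""
    return sorted(candidates, key=lambda doc_id: -pair_score(query, DOCS[doc_id]))

query = "how to reduce reranking latency"
candidates = first_stage(query, k=3)
reranked = rerank(query, candidates)

print(candidates)  # ['d1', 'd3', 'd2']
print(reranked)    # ['d3', 'd1', 'd2']
```

    The reranker moves d3 to the top, but it only ever sees the three candidates it was handed: d4 was not retrieved, so no amount of rescoring can bring it back. The cost also follows from the structure, since pair_score runs once per candidate on every query.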

    Why reranking exists, and when it becomes optional

    Reranking is fundamentally a correction step. It exists because first-stage retrieval over a noisy index returns candidates in imperfect order, so a second model re-sorts them. If the index is clean from the start, the first pass already returns well-ordered, relevant results, and the correction step becomes optional. Green Vectors eliminates redundant vectors at ingestion, so the index is not polluted with near-duplicate noise and first-pass relevance is high. For most production workloads this removes the dependency on a separate reranking stage. Ultra-high-precision applications may still layer reranking on top.

    What reranking can and cannot do

    Reranking can fix the order of retrieved results. It cannot improve recall, meaning it cannot surface a relevant document that first-stage retrieval failed to return. Improving recall requires a better index or better first-stage retrieval, not a reranker.
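
    The distinction can be checked with two standard metrics. The sketch below uses made-up document IDs and relevance labels (all hypothetical) to show that reordering leaves recall over the full candidate set unchanged while improving a rank-sensitive metric such as reciprocal rank.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

relevant = {"a", "b", "z"}        # "z" was never returned by retrieval
retrieved = ["x", "y", "a", "b"]  # first-stage order
reranked = ["a", "b", "x", "y"]   # same documents, better order

n = len(retrieved)
print(recall_at_k(retrieved, relevant, n))   # 2 of 3 relevant docs found
print(recall_at_k(reranked, relevant, n))    # identical: same candidate set
print(reciprocal_rank(retrieved, relevant))  # first relevant doc at rank 3
print(reciprocal_rank(reranked, relevant))   # first relevant doc at rank 1
```

    One nuance: recall at a smaller cutoff (here, k=2) can rise after reranking, because relevant candidates move into the top k. It is still capped by the recall of the candidate set, since "z" is missing from both lists.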

    More questions

    Reranking cannot recover documents that retrieval missed. It only reorders documents already retrieved; recovering missed documents requires improving the index or first-stage retrieval.
    Reranking can be skipped when first-pass retrieval is already high quality. Reranking compensates for noisy retrieval; with an index that is clean from ingestion, as Green Vectors produces, first-pass relevance is high enough that reranking is optional for most workloads.
    The case for skipping reranking is architectural: reranking corrects for noise that a clean index does not produce in the first place. In the Elastic BBQ benchmark, Green Vectors achieved a relevancy score of 0.9658, indicating strong first-pass retrieval quality. The most reliable test is to evaluate first-pass results on your own workload.
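
    One way to run that evaluation on your own workload is to compare a ranking metric for first-pass results against the same metric after reranking, over a set of queries with relevance judgments. The sketch below uses nDCG@k, which is one common choice and an assumption here; it is not necessarily the metric behind the benchmark figure above. The queries, document IDs, and graded judgments are hypothetical placeholders for your own labeled data.

```python
import math

def dcg_at_k(ranked_ids, gains, k):
    """Discounted cumulative gain: graded relevance discounted by log2(rank + 1)."""
    return sum(
        gains.get(doc_id, 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
    )

def ndcg_at_k(ranked_ids, gains, k):
    """DCG normalized by the DCG of the ideal ordering of the judged documents."""
    ideal = sorted(gains.values(), reverse=True)[:k]
    ideal_dcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg_at_k(ranked_ids, gains, k) / ideal_dcg if ideal_dcg else 0.0

def mean_ndcg(runs, judgments, k):
    """Average nDCG@k over every judged query."""
    return sum(ndcg_at_k(runs[q], judgments[q], k) for q in judgments) / len(judgments)

# Hypothetical graded judgments: query -> {doc_id: relevance grade}.
judgments = {"q1": {"a": 3, "b": 1}, "q2": {"c": 2}}

# Hypothetical system outputs: query -> ranked doc IDs.
first_pass = {"q1": ["a", "b", "x"], "q2": ["y", "c", "z"]}
reranked = {"q1": ["a", "b", "x"], "q2": ["c", "y", "z"]}

base = mean_ndcg(first_pass, judgments, k=3)
with_rerank = mean_ndcg(reranked, judgments, k=3)
print(f"first pass nDCG@3: {base:.4f}")
print(f"reranked   nDCG@3: {with_rerank:.4f}")
print(f"gain from reranking: {with_rerank - base:.4f}")
```

    If the gain on a representative query set is small relative to the added latency and cost, that supports treating reranking as optional for the workload; a large gain suggests keeping it.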

    Ready to go deeper?

    Request access to Kitana, our Python SDK built on Green Vectors, or get in touch with the team.