arxiv:2411.11767

Drowning in Documents: Consequences of Scaling Reranker Inference

Published on Nov 18
Submitted by mrdrozdov on Nov 19

Abstract

Rerankers, typically cross-encoders, are often used to re-score the documents retrieved by cheaper initial IR systems. This is because, though expensive, rerankers are assumed to be more effective. We challenge this assumption by measuring reranker performance for full retrieval, not just re-scoring first-stage retrieval. Our experiments reveal a surprising trend: the best existing rerankers provide diminishing returns when scoring progressively more documents and actually degrade quality beyond a certain limit. In fact, in this setting, rerankers can frequently assign high scores to documents with no lexical or semantic overlap with the query. We hope that our findings will spur future research to improve reranking.
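For readers who want to see the setting the abstract describes in code, here is a minimal sketch, assuming the sentence-transformers library: a cheap first-stage retriever proposes k candidates, a cross-encoder re-scores them, and k is swept upward. The model names, toy corpus, and helper functions are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch (not the authors' code) of "scoring progressively more documents"
# with a cross-encoder on top of a first-stage retriever.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # assumed bi-encoder
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")            # assumed cross-encoder

corpus = [
    "Tides are caused by the gravitational pull of the moon and sun.",
    "The stock market closed higher on Friday.",
    "Ocean tides rise and fall roughly twice a day.",
    "How to bake sourdough bread at home.",
]
doc_emb = retriever.encode(corpus, convert_to_tensor=True)

def rerank_top_k(query: str, k: int) -> list[str]:
    """First-stage retrieval of k candidates, then joint query-document re-scoring."""
    q_emb = retriever.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_emb, top_k=k)[0]      # first-stage candidates
    candidates = [corpus[h["corpus_id"]] for h in hits]
    scores = reranker.predict([(query, c) for c in candidates])  # cross-encoder scores
    order = sorted(range(len(candidates)), key=lambda i: float(scores[i]), reverse=True)
    return [candidates[i] for i in order]

# Sweeping k mimics scoring progressively more documents; on a real labeled corpus you
# would measure recall/nDCG at each depth to see where quality starts to degrade.
for k in (1, 2, 4):
    print(k, rerank_top_k("what causes tides?", k))
```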

Community

Paper author and submitter:

Rerankers (cross-encoders) and retrievers (embedding models) are often derived from the same architecture, yet rerankers are assumed to be more accurate because they jointly encode the query and document rather than processing them independently. In this work, we find two surprising results with respect to this intuition: (1) reranking helps at first, but reranking too many documents eventually degrades quality, and (2) in a fair matchup between rerankers and retrievers, where we rerank the full dataset, rerankers are less accurate than retrievers. In our paper we detail extensive experiments across both academic and enterprise datasets, and include results suggesting that listwise reranking with LLMs is more robust than cross-encoders when scaling inference via reranking.
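Since the comment above points to listwise reranking with LLMs, here is a hedged sketch of what that looks like in practice: a RankGPT-style permutation prompt plus a tolerant parser. The prompt format and the `call_llm` client are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch of listwise LLM reranking: the model sees all candidates at once and
# returns an ordering, rather than scoring each (query, doc) pair independently.
import re

def build_listwise_prompt(query: str, passages: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Rank the following passages by relevance to the query.\n"
        f"Query: {query}\n{numbered}\n"
        "Answer with the passage numbers in order, e.g. 2 > 1 > 3."
    )

def parse_permutation(response: str, n: int) -> list[int]:
    """Pull passage indices out of a '2 > 1 > 3' style answer, filling in anything dropped."""
    seen, order = set(), []
    for tok in re.findall(r"\d+", response):
        i = int(tok) - 1
        if 0 <= i < n and i not in seen:
            seen.add(i)
            order.append(i)
    order += [i for i in range(n) if i not in seen]
    return order

def listwise_rerank(query: str, passages: list[str], call_llm) -> list[str]:
    # call_llm is a hypothetical chat-completion client: str prompt -> str response.
    response = call_llm(build_listwise_prompt(query, passages))
    return [passages[i] for i in parse_permutation(response, len(passages))]
```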

Extremely interesting, nice work 👍

We maintain some domain-specific hybrid search systems, and this paper has shown us we need to look at optimizing the top-k in our cross-encoder phase. Interesting work - I'm a little disappointed more cross-encoder models weren't evaluated (e.g., mixedbread).
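If, like the commenter above, you want to tune the candidate depth fed into a cross-encoder stage, one simple approach is a sweep over depths on a labeled validation set. This is a sketch under assumptions: `first_stage`, `rerank`, and the recall@10 metric are placeholders for your own pipeline, not anything from the paper.

```python
# Sweep the first-stage candidate depth and keep the one whose reranked output
# scores best on held-out labeled queries.
def recall_at_10(ranked_ids: list, relevant_ids: set) -> float:
    return len(set(ranked_ids[:10]) & relevant_ids) / max(len(relevant_ids), 1)

def tune_candidate_depth(queries, qrels, first_stage, rerank,
                         depths=(20, 50, 100, 200, 500)):
    """qrels: query -> set of relevant doc ids; first_stage(q, k) and rerank(q, ids)
    are your own retrieval and cross-encoder stages."""
    best_depth, best_score = None, -1.0
    for k in depths:
        score = sum(
            recall_at_10(rerank(q, first_stage(q, k)), qrels[q]) for q in queries
        ) / len(queries)
        if score > best_score:
            best_depth, best_score = k, score
    return best_depth, best_score
```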
