Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022
DOI: 10.1145/3477495.3531833

An Efficiency Study for SPLADE Models

Abstract: Latency and efficiency issues are often overlooked when evaluating IR models based on Pretrained Language Models (PLMs), because of the multiple hardware and software testing scenarios involved. Nevertheless, efficiency is an important property of such systems and should not be overlooked. In this paper, we focus on improving the efficiency of the SPLADE model, since it has achieved state-of-the-art zero-shot performance and competitive results on TREC collections. SPLADE efficiency can be controlled via a regularization factor…
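The efficiency lever mentioned in the abstract is the sparsity regularization applied during training. As a rough illustration only (not code from the paper), the PyTorch sketch below shows SPLADE-style log-saturated term weighting and a FLOPS-style regularizer whose factor controls how sparse, and therefore how fast to retrieve, the representations become; the MLM logits and the lambda value are stand-ins.

```python
# Minimal sketch of SPLADE-style term weighting plus a FLOPS-style
# regularizer; random tensors stand in for MLM logits, and the lambda
# value is illustrative, not taken from the paper.
import torch

batch_size, seq_len, vocab_size = 8, 32, 30522

mlm_logits = torch.randn(batch_size, seq_len, vocab_size)  # stand-in for MLM head output

# Log-saturated ReLU weights, max-pooled over the sequence -> one
# vocabulary-sized (sparse in practice) vector per input text.
token_weights = torch.log1p(torch.relu(mlm_logits))        # (B, L, V)
doc_reps = token_weights.max(dim=1).values                 # (B, V)

# FLOPS regularizer: squared mean activation per vocabulary term, summed
# over the vocabulary; a larger factor pushes toward sparser, faster models.
lambda_flops = 1e-3
flops_reg = lambda_flops * (doc_reps.mean(dim=0) ** 2).sum()

print(doc_reps.shape, flops_reg.item())
```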

Cited by 36 publications (16 citation statements)
References 45 publications
“…We confirmed this by replacing the shared encoder with two separate ones (distilSplade_sep), which reduced latency from 122.5 ms to 50.2 ms, a 59% decrease. This benefit of separate encoders was also reported in [14], and our results further support its substantial impact.…”
Section: RQ2: How Do LSR Methods Perform With Recent Advanced Trainin… (supporting)
confidence: 88%
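The latency gain quoted above comes from decoupling the query encoder from the document encoder, so the online (query-time) model can be much smaller than the offline (indexing-time) one. A toy PyTorch sketch of that asymmetry, with random embeddings standing in for real inputs and deliberately simplified encoders:

```python
# Toy illustration of separate document/query encoders: the large encoder
# runs offline at indexing time, while queries only pass through a much
# smaller network at search time. Shapes and modules are stand-ins.
import time
import torch
import torch.nn as nn

vocab_size, hidden = 30522, 768

doc_encoder = nn.Sequential(                      # large, offline
    nn.Linear(hidden, 4 * hidden), nn.GELU(), nn.Linear(4 * hidden, vocab_size))
query_encoder = nn.Sequential(                    # small, online
    nn.Linear(hidden, hidden // 4), nn.GELU(), nn.Linear(hidden // 4, vocab_size))

def splade_pool(logits):
    """Log-saturated ReLU weights, max-pooled over the token dimension."""
    return torch.log1p(torch.relu(logits)).max(dim=1).values

doc_tokens = torch.randn(1, 200, hidden)          # fake 200-token document
query_tokens = torch.randn(1, 8, hidden)          # fake 8-token query

with torch.no_grad():
    d_rep = splade_pool(doc_encoder(doc_tokens))  # computed once, offline

start = time.perf_counter()
with torch.no_grad():
    q_rep = splade_pool(query_encoder(query_tokens))
elapsed_ms = (time.perf_counter() - start) * 1e3

score = (q_rep * d_rep).sum()                     # sparse dot product in practice
print(f"query encoding: {elapsed_ms:.2f} ms, score: {score.item():.2f}")
```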
“…In practice, distilSplade_qMLP could be viewed as a more efficient drop-in replacement for the full model. This use of qMLP is complementary to other changes (e.g., using a smaller encoder as in [14]) to improve the efficiency of LSR.…”
Section: RQ3: How Does The Choice Of Encoder Architecture And Regular… (mentioning)
confidence: 99%
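The qMLP encoder referenced here replaces the transformer query encoder with a small MLP over static token embeddings, so query encoding no longer needs a full forward pass. A hypothetical sketch (names and sizes are mine, not taken from the cited paper):

```python
# Hypothetical MLP-based query encoder in the spirit of qMLP: each query
# token gets a weight from a small MLP applied to its static embedding,
# and the weights are scattered into a vocabulary-sized sparse vector.
import torch
import torch.nn as nn

vocab_size, emb_dim = 30522, 256

token_embeddings = nn.Embedding(vocab_size, emb_dim)
weight_mlp = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def encode_query(token_ids: torch.Tensor) -> torch.Tensor:
    """token_ids: (num_query_tokens,) -> (vocab_size,) sparse query vector."""
    with torch.no_grad():
        per_token = torch.relu(weight_mlp(token_embeddings(token_ids))).squeeze(-1)
    q_vec = torch.zeros(vocab_size)
    q_vec[token_ids] = per_token   # duplicate terms simply keep the last weight
    return q_vec

q_vec = encode_query(torch.tensor([2023, 2003, 1037, 7099]))  # arbitrary token ids
print((q_vec > 0).sum().item(), "non-zero query terms")
```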
“…Efficient SPLADE model. Table 5 shows the application of 2GTI to a recently published efficient SPLADE model [20], which has made several improvements in retrieval speed. We have used the released checkpoint of this efficient model, called BT-SPLADE-L, which has a weaker MRR@10 score but is significantly faster than our trained SPLADE baseline reported in Table 2.…”
Section: Evaluations (mentioning)
confidence: 99%
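For readers who want to try the released efficient SPLADE checkpoint mentioned above, a loading sketch with Hugging Face transformers follows; the Hub identifier is an assumption on my part and should be checked against the authors' release, and the weighting step is the standard SPLADE log-saturated max-pooling.

```python
# Sketch of loading an efficient SPLADE checkpoint; the Hub id below is
# assumed, not confirmed -- check the authors' release for the exact name.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "naver/efficient-splade-VI-BT-large-doc"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

inputs = tokenizer("an efficiency study for splade models", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                                   # (1, L, V)
weights = torch.log1p(torch.relu(logits)) * inputs["attention_mask"].unsqueeze(-1)
doc_rep = weights.max(dim=1).values                                   # (1, V)
print((doc_rep > 0).sum().item(), "non-zero terms")
```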
“…We follow the strategy used in our latest TREC notebooks, in that we strive to make this more streamlined than a normal research paper would be. We now list the papers that introduce and detail the models used here, and refer the reader to them for better explanations than the ones given here, which are mainly dedicated to how we apply the methods to MIRACL rather than to the methods themselves: i) training non-English SPLADE models [11], ii) the SPLADE model [5,10], iii) the Contriever model and its pretraining [8], iv) the RankT5 reranker [16], v) MonoT5 [13], vi) the LCE loss [6], vii) ColBERT [9], and viii) for our ensembling we use Ranx [1] and their min-max normalized sum ensembling.…”
Section: Introduction (mentioning)
confidence: 99%
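The min-max normalized sum ensembling mentioned at the end of this quote maps, as far as I can tell, onto ranx's fusion API roughly as follows; the runs below are toy score dictionaries, whereas real runs would come from SPLADE, Contriever, the rerankers, and so on.

```python
# Sketch of min-max normalized sum fusion with ranx; toy runs only.
from ranx import Run, fuse

splade_run = Run({"q1": {"d1": 12.3, "d2": 9.8}, "q2": {"d3": 7.1, "d4": 5.0}},
                 name="splade")
dense_run = Run({"q1": {"d1": 0.82, "d3": 0.75}, "q2": {"d3": 0.91, "d1": 0.40}},
                name="contriever")

# Scores of each run are min-max normalized per query, then summed per document.
fused_run = fuse(runs=[splade_run, dense_run], norm="min-max", method="sum")
print(fused_run)
```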
“…This was mostly due to desperation when we saw everyone overtaking us on the dev set. These rerankers were then tested on known languages, and they did not improve the results.…”
10. https://huggingface.co/google/byt5-xl
11. https://huggingface.co/microsoft/mdeberta-v3-base
mentioning
confidence: 99%