2021
DOI: 10.48550/arxiv.2112.01810
Preprint

Siamese BERT-based Model for Web Search Relevance Ranking Evaluated on a New Czech Dataset

Abstract: Web search engines focus on serving highly relevant results within hundreds of milliseconds. Pre-trained language transformer models such as BERT are therefore hard to use in this scenario due to their high computational demands. We present our real-time approach to the document ranking problem leveraging a BERT-based siamese architecture. The model is already deployed in a commercial search engine and it improves production performance by more than 3%. For further research and evaluation, we release DaReCzech…
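The bi-encoder setup the abstract refers to can be pictured with a short sketch. The following is a minimal, illustrative siamese ranker assuming PyTorch and the Hugging Face transformers library; the Small-E-Czech checkpoint name, mean pooling and cosine-similarity scoring are assumptions for illustration, not the paper's exact deployed architecture.

```python
# Illustrative sketch only: a siamese (bi-encoder) relevance ranker in the spirit of
# the abstract. Checkpoint name, mean pooling and cosine scoring are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SiameseRanker(nn.Module):
    def __init__(self, model_name="Seznam/small-e-czech"):  # assumed checkpoint name
        super().__init__()
        # A single shared encoder embeds queries and documents independently,
        # so document vectors can be precomputed offline for low-latency serving.
        self.encoder = AutoModel.from_pretrained(model_name)

    def embed(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Mean-pool token vectors over non-padding positions (illustrative choice).
        mask = attention_mask.unsqueeze(-1).float()
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

    def forward(self, query_batch, doc_batch):
        q = self.embed(query_batch["input_ids"], query_batch["attention_mask"])
        d = self.embed(doc_batch["input_ids"], doc_batch["attention_mask"])
        # Relevance score = cosine similarity between query and document embeddings.
        return torch.cosine_similarity(q, d, dim=-1)

tokenizer = AutoTokenizer.from_pretrained("Seznam/small-e-czech")
queries = tokenizer(["jak uvařit rýži"], return_tensors="pt", padding=True, truncation=True)
docs = tokenizer(["Rýži propláchněte a vařte zhruba 15 minut."],
                 return_tensors="pt", padding=True, truncation=True)
scores = SiameseRanker()(queries, docs)
```

Because the two inputs are encoded separately, document embeddings can be indexed ahead of time and only the query needs to be encoded at request time, which is what makes this family of models viable under a strict latency budget.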

Cited by 2 publications (3 citation statements)
References 14 publications

“…For our experiments on English, we use the pre-trained ELECTRA-small model introduced by Clark et al (2020), which has 14M parameters. For Czech, we employ the pre-trained monolingual model Small-E-Czech (Kocián et al, 2021) with the same size and architecture. Firstly, we train separate models for both tasks (ABSA and SRL) and select the optimal set of hyper-parameters on the development data.…”
Section: Datasets and Models Fine-tuning
confidence: 99%
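The fine-tuning recipe quoted above (one model per task, hyper-parameters chosen on development data) could look roughly like the sketch below; the learning-rate grid, the sequence-classification head and the commented-out checkpoint names are assumptions for illustration, not the cited paper's exact configuration.

```python
# Hypothetical sketch: fine-tune a separate model per task and keep the
# hyper-parameter setting that performs best on the development data.
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def finetune_and_eval(model_name, train_ds, dev_ds, learning_rate):
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
    args = TrainingArguments(
        output_dir=f"out-lr{learning_rate}",
        learning_rate=learning_rate,
        num_train_epochs=3,
        per_device_train_batch_size=32,
    )
    trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=dev_ds)
    trainer.train()
    return trainer.evaluate()["eval_loss"]  # dev-set loss as the selection criterion

def select_learning_rate(model_name, train_ds, dev_ds, grid=(1e-5, 3e-5, 5e-5)):
    # Keep the learning rate that minimises development loss (grid is illustrative).
    return min(grid, key=lambda lr: finetune_and_eval(model_name, train_ds, dev_ds, lr))

# One model per task/language, e.g. (dataset objects omitted):
# select_learning_rate("google/electra-small-discriminator", absa_train, absa_dev)
# select_learning_rate("Seznam/small-e-czech", srl_train, srl_dev)
```
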
“…Czech Electra model (Kocián et al, 2021), two multilingual models, mBERT (Devlin et al, 2019) and XLM-R (Conneau et al, 2020), and the original monolingual English BERT model (Devlin et al, 2019). We fine-tune all the models for the binary classification task, i.e., subjective vs. objective sentence detection. For all models based on the original BERT model, we use the hidden vector h ∈ R^H of the classification token [CLS] that represents the entire input sequence, where H is the hidden size of the model.…”
Section: Transformer Models
confidence: 99%
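The [CLS]-vector classification described in this quote corresponds to the following minimal sketch, again assuming PyTorch and Hugging Face transformers; the checkpoint name and the single linear head are illustrative assumptions.

```python
# Minimal sketch of a [CLS]-based binary classifier: the hidden vector of the first
# token (h in R^H) represents the whole sequence and feeds a linear classification head.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ClsBinaryClassifier(nn.Module):
    def __init__(self, model_name="bert-base-cased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden_size = self.encoder.config.hidden_size  # H, the model's hidden size
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # h: hidden state of the first ([CLS]) token, representing the entire input.
        h = outputs.last_hidden_state[:, 0]
        return self.classifier(h)  # logits for subjective vs. objective

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
batch = tokenizer(["Ten film byl naprosto úžasný."], return_tensors="pt")
logits = ClsBinaryClassifier()(batch["input_ids"], batch["attention_mask"])
```
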
“…Czech Electra (Kocián et al, 2021) is a Czech model based on the Electra-small model (Clark et al, 2020). Czert-B (Sido et al, 2021) is a Czech variant of the original BERT BASE model (Devlin et al, 2019). RobeCzech (Straka et al, 2021) is a Czech version of the RoBERTa model (Liu et al, 2019). BERT (Devlin et al, 2019) is the original BERT BASE model. mBERT (Devlin et al, 2019) is a cased multilingual version of BERT BASE that was jointly trained on 104 languages. XLM-R-Large (Conneau et al, 2020) is a multilingual version of RoBERTa (Liu et al, 2019) that supports 100 languages.…”
confidence: 99%