2022
DOI: 10.1007/978-3-030-99736-6_2

PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval

Cited by 15 publications (11 citation statements)
References 32 publications
“…Earlier techniques for legal information retrieval were mainly based on term-matching approaches (Kim and Goebel, 2017; Tran et al, 2018). Recently, a growing number of works have used neural networks to enhance retrieval performance, including word embedding models (Landthaler et al, 2016), doc2vec models (Sugathadasa et al, 2018), CNN-based models (Tran et al, 2019), and BERT-based models (Nguyen et al, 2021; Chalkidis et al, 2021; Althammer et al, 2022). To the best of our knowledge, we are the first to exploit the structure of statute law with GNNs to improve the performance of dense retrieval models.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the context of document-to-document retrieval where the "query" can be extremely long, Tran et al [95] first produce a summary that is further paired with lexical features in order to retrieve cases. Both PARM [96] and BERT-PLI [28] also tried to condense the documents, and performed paragraph-level modeling on top of candidates returned via BM25 or similar methods. Instead of focusing on the techniques, Shao et al [85] presented a comparative user behavior study between legal and general-domain search, and suggested that legal information retrieval is more challenging with respect to query length and number of clicks/pages, among other metrics.…”
Section: A An Overview of Major Legal NLP Tasks (mentioning)
confidence: 99%
“…For the pool creation we use the runs from Hofstätter et al [13]. In order to have different first stage retrieval methods we use the lexical retrieval run with BM25 [24] (run 1 in Table 2) as well as the SciBERT_DOT run (run 2 in Table 2) which is based on dense retrieval [3,15]. As additional run we use the Ensemble which reranks BM25 Top-200 candidates using an Ensemble of BERT_CAT based on SciBERT, PubMedBERT-Abstract and PubMedBert-Full Text (run 7 in Table 2).…”
Section: Data and Pool Preparation (mentioning)
confidence: 99%