2021
DOI: 10.3233/faia210326

An Information Retrieval Pipeline for Legislative Documents from the Brazilian Chamber of Deputies

Abstract: This work investigates information retrieval methods to address the difficulties of the Preliminary Search, a step in the lawmaking process of the Brazilian Chamber of Deputies. To that end, different preprocessing approaches, stemmers, language models, and BM25 variants were compared. Two legislative corpora from the Chamber were used to build and validate the pipeline. All texts were converted to lowercase and had stopwords, accents, and punctuation removed. Words were represented by their stem combi…
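
The normalization steps named in the abstract (lowercasing, then removing accents, punctuation, and stopwords) can be illustrated with a minimal Python sketch. The order of the steps, the function names, and the tiny stopword list are assumptions made for illustration and are not taken from the paper; stemming is left out because the abstract is truncated before it describes that step.

```python
import string
import unicodedata
from typing import List

# Tiny illustrative stopword list (an assumption); a real pipeline
# would use a full Portuguese stopword list.
STOPWORDS = {"a", "o", "as", "os", "de", "da", "do", "das", "dos",
             "e", "em", "para", "que"}

# Translation table that maps every ASCII punctuation mark to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def preprocess(text: str) -> List[str]:
    """Lowercase, remove accents and punctuation, then drop stopwords."""
    text = strip_accents(text.lower())
    text = text.translate(_PUNCT_TABLE)
    return [tok for tok in text.split() if tok not in STOPWORDS]


tokens = preprocess("Educação, saúde e segurança para os municípios.")
print(tokens)
```

Accents are stripped before the stopword lookup, so the stopword list itself is written without accents.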

Cited by 9 publications (19 citation statements)
References 16 publications (20 reference statements)
“…Souza et al [11,28] investigated IR methods and presented a pipeline for the retrieval of legislative documents within the context of the Brazilian Chamber of Deputies. Evaluating the use of three variants of the BM25 algorithm, along with different pre-processing techniques, they built the IR model currently employed by the Chamber to retrieve bills and other queries relevant to a parliamentarian's request.…”
Section: Legal Information Retrieval (mentioning)
confidence: 99%
“…Nowadays, the IR model used by Conle to automatically retrieve relevant documents is based on BM25L [38] and a combination of pre-processing techniques: punctuation, accentuation, and stopwords removal + Stemming, with the Savoy algorithm [39], + unigram and bigram; as presented by [11]. BM25L ranks the documents by estimating their relevance to a query.…”
Section: The Scenario of the Brazilian Chamber of Deputies (mentioning)
confidence: 99%
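
The ranking step described in the statement above (BM25L over unigram and bigram terms) might look roughly like the following Python sketch. It follows the commonly published BM25L formulation; the parameter values `k1=1.5`, `b=0.75`, and `delta=0.5`, the helper names, and the toy documents are assumptions for illustration, not the Chamber's actual configuration or code.

```python
import math
from collections import Counter
from typing import List


def with_bigrams(tokens: List[str]) -> List[str]:
    """Return the unigrams followed by bigrams of adjacent tokens."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def bm25l_scores(query: List[str], docs: List[List[str]],
                 k1: float = 1.5, b: float = 0.75,
                 delta: float = 0.5) -> List[float]:
    """Score each document against the query with BM25L.

    Parameter defaults are illustrative assumptions.
    Expects a non-empty list of non-empty documents.
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in dict.fromkeys(query):  # unique query terms, in order
            if tf[term] == 0:
                continue
            idf = math.log((n_docs + 1) / (df[term] + 0.5))
            # Length-normalized term frequency, shifted by delta so that
            # long documents are not penalized too heavily.
            ctd = tf[term] / (1 - b + b * len(d) / avgdl)
            score += idf * (k1 + 1) * (ctd + delta) / (k1 + ctd + delta)
        scores.append(score)
    return scores


docs = [
    with_bigrams(["projeto", "lei", "educacao"]),
    with_bigrams(["projeto", "lei", "saude"]),
    with_bigrams(["orcamento", "publico"]),
]
query = with_bigrams(["lei", "educacao"])
scores = bm25l_scores(query, docs)
print(scores)
```

In this toy example the first document matches two unigrams and one bigram of the query, the second matches a single unigram, and the third matches nothing, so the scores decrease in that order.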
“…Information Retrieval techniques are at the core of many legal research platforms, such as Lexis+ 2 and Westlaw Edge 3 , and have several critical application scenarios, such as legislative document retrieval [82], [83] and case retrieval [84], [85]. Depending on the application, the queries can be short (e.g., simple keywords), of medium length (e.g., Boolean or natural language queries) or long (e.g., whole documents).…”
Section: A. An Overview of Major Legal NLP Tasks (mentioning)
confidence: 99%