2018
DOI: 10.7753/ijcatr0710.1003

Text Mining in Digital Libraries using OKAPI BM25 Model

Abstract: The emergence of the internet has made vast amounts of information available and easily accessible online. As a result, most libraries have digitized their content in order to remain relevant to their users and to keep pace with the advancement of the internet. However, these digital libraries have been criticized for using inefficient information retrieval models that do not rank the retrieved results by relevance. This paper proposed the use of the OKAPI BM25 model in text mining as a means of improv…
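
For readers unfamiliar with the ranking function named in the abstract, the following is a minimal sketch of Okapi BM25 scoring in Python. It is not the paper's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative (an assumption;
            # other IDF formulations exist).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy collection, tokenized by whitespace.
docs = [
    "digital libraries use retrieval models".split(),
    "bm25 ranks retrieved documents by relevance".split(),
    "the internet made information accessible".split(),
]
query = "bm25 relevance ranking".split()
scores = bm25_scores(query, docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Unlike a Boolean model, which only reports whether a document matches, this kind of scoring yields a numeric value per document, so results can be ordered by estimated relevance.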

Citations: cited by 5 publications (4 citation statements)
References: 7 publications (11 reference statements)
“…The authors use the PubMed digital database for Information Retrieval. The performance of the BM25 model was compared with the Boolean model and the vector space model in [17]. The results showed that the documents best ranked as relevant were determined using the OkapiBM25 probability-based ranking algorithm.…”
Section: Related Work (mentioning)
confidence: 99%
“…BM25 is a ranking method that matches keywords against the documents in a data collection and orders them accordingly. According to Tinega et al. (2018), ranking with BM25 yields considerably better results than the vector space model and the Boolean model [8].…”
Section: Introduction (unclassified)
“…That study used bug report data and was able to detect duplicate bug reports with an accuracy of 90%. In addition, the study by Tinega et al. (2018) confirms that ranking with BM25 produces considerably better results than the Boolean model and the vector space model. The study by Whissel & Clarke (2013) also shows that BM25 produces better rankings than cosine similarity. Combining BM25 with KNN gave the best accuracy, 88.97%.…”
Section: Introduction (unclassified)