2020
DOI: 10.11591/ijeecs.v19.i1.pp353-362

LSA & LDA topic modeling classification: comparison study on e-books

Abstract: With the rapid growth of information technology, the amount of unstructured text data in digital libraries has increased rapidly, and analyzing, organizing, and automatically classifying text in e-research repositories so that it can be put to use has become a major challenge. Manual categorization of text documents requires substantial financial and human resources. For this reason, topic modeling is used to classify documents. This paper addresses a comparison stu…

Citations: cited by 44 publications (22 citation statements)
References: 15 publications
“…Latent Dirichlet Allocation (LDA) is a generative statistical model used to extract the latent topic structure of text documents. LDA is a machine-learning technique used in different areas such as retrieval field, document classification and topic modelling [ 31 ].…”
Section: Methods (mentioning)
confidence: 99%
“…And each latent topic is a distribution over characters or words that represents all the words in the document. Latent topics can be generated from the collection of words in a document with different proportion values [12]. The data are first turned into a corpus or Dictionary using the Gensim library module in Python, namely gensim.corpora.…”
Section: Building the LDA Model (unclassified)
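The corpus and dictionary step mentioned in the statement above can be illustrated with a minimal sketch. It uses only the Python standard library as a stand-in for what Gensim's `corpora.Dictionary` and its `doc2bow` method provide; the helper names `build_dictionary` and `doc_to_bow` and the toy documents are hypothetical and do not come from the cited paper.

```python
from collections import Counter


def build_dictionary(tokenized_docs):
    """Map each distinct token to an integer id, in order of first appearance."""
    token2id = {}
    for doc in tokenized_docs:
        for token in doc:
            if token not in token2id:
                token2id[token] = len(token2id)
    return token2id


def doc_to_bow(doc, token2id):
    """Return sorted (token_id, count) pairs for one tokenized document."""
    counts = Counter(token2id[t] for t in doc if t in token2id)
    return sorted(counts.items())


# Toy, already-tokenized documents (illustrative data only).
docs = [
    ["topic", "model", "text"],
    ["text", "text", "library"],
]

token2id = build_dictionary(docs)
corpus = [doc_to_bow(d, token2id) for d in docs]
```

A topic model such as LDA would then be fitted on a bag-of-words `corpus` of this shape, with the dictionary used to map topic word ids back to readable tokens.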
“…This study uses Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA), and Hierarchical Dirichlet Process (HDP) as topic modeling methods. LSI, also called Latent Semantic Analysis (LSA), is a method for extracting and representing words from text using statistical computation on large documents or data (Mohammed & Al-augby, 2020). LSA is a fast method and the easiest to use among topic modeling methods (Qomariyah et al., 2019).…”
Section: Introduction (unclassified)