Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS) 2020
DOI: 10.18653/v1/2020.nlposs-1.19

TOMODAPI: A Topic Modeling API to Train, Use and Compare Topic Models

Abstract: From LDA to neural models, different topic modeling approaches have been proposed in the literature. However, their suitability and performance are not easy to compare, particularly when the algorithms are being used in the wild on heterogeneous datasets. In this paper, we introduce ToModAPI (TOpic MOdeling API), a wrapper library to easily train, evaluate and infer using different topic modeling algorithms through a unified interface. The library is extensible and can be used in Python environments or through …

Citations: cited by 12 publications (13 citation statements)
References: 22 publications
“…We measured the intrinsic coherence of each model using normalised pointwise mutual information (NPMI) (Eq. (2)) (Lisena et al 2020), which traditionally has a strong correlation with human ratings (Röder et al 2015). This was done using the TOMODAPI (Lisena et al 2020), which applies Eq.…”
Section: Evaluating Convergence Patterns (mentioning)
confidence: 99%
“…(2)) (Lisena et al 2020), which traditionally has a strong correlation with human ratings (Röder et al 2015). This was done using the TOMODAPI (Lisena et al 2020), which applies Eq. (2) to couples of words, computing their joint probabilities.…”
Section: Evaluating Convergence Patterns (mentioning)
confidence: 99%
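
The statement above describes NPMI being applied to pairs of topic words using their joint probabilities. Eq. (2) of the citing paper is not reproduced on this page, so the following is only a minimal sketch of the standard NPMI definition, NPMI(w1, w2) = log(P(w1, w2) / (P(w1) P(w2))) / (-log P(w1, w2)), with probabilities estimated from document-level co-occurrence. The function name `npmi_coherence`, its arguments, and the edge-case conventions are assumptions made for illustration; this is not ToModAPI's implementation.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs of one topic (hypothetical helper).

    topic_words: list of at least two words describing a topic.
    documents:   list of tokenized documents (lists of words).
    Probabilities are the fraction of documents containing the word(s).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0 or len(topic_words) < 2:
        raise ValueError("need at least one document and two topic words")

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # Assumed convention: words that never co-occur score -1.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Assumed convention: both words in every document (0/0 case).
            scores.append(1.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


docs = [["a", "b"], ["a", "b"], ["c"], ["c", "d"]]
always_together = npmi_coherence(["a", "b"], docs)
never_together = npmi_coherence(["a", "c"], docs)
```

Under these assumptions, scores range from -1 (the words never co-occur) through 0 (they co-occur as often as independence would predict) to 1 (they only ever occur together).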
“…ToModAPI (Lisena et al, 2020) is a python API that allows for training, inference, and evaluating different topic models, also including some of the most recent. However, it does not provide a method for finding the best hyper-parameter configuration of topic models.…”
Section: Existing Framework (mentioning)
confidence: 99%
“…Current topic modeling frameworks (McCallum et al, 2005; Qiang et al, 2018; Lisena et al, 2020) typically focus on the release of topic modeling algorithms while ignoring one or more critical aspects of the topic modeling pipeline, such as preprocessing, evaluation, comparison of the models, and visualization. Most importantly, they disregard the hyper-parameter selection.…”
Section: Introduction (mentioning)
confidence: 99%
“…The full set of parameters is documented in the repository; • For each trained model, we compute all the intrinsic (coherence) metrics and the groundtruth-based ones. For the experiment, we rely on ToModAPI (Lisena et al, 2020), an open-source topic modelling API that is built to easily train, evaluate and compare several topic models. This framework provides a common interface for training, performing topic inference, and evaluating using coherence and ground truth.…”
Section: Varying the Datasets (mentioning)
confidence: 99%