2021
DOI: 10.1007/s10664-021-10026-0

Topic modeling in software engineering research

Abstract: Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling needs to be applied carefully …

Citations: cited by 59 publications (53 citation statements)
References: 171 publications (383 reference statements)
“…First, determine the gain over all (K) value use Equation 2 [19]. Then, determine of the time constant (τ) value by Equation 3 [20], [21] and Equation 4.…”
Section: System Planning (mentioning)
confidence: 99%
“…Suominen et al [12] uses LDA to create topic-based linkages between publications and patents based on the semantic content in the documents. In empirical investigations [13], topic modeling has analyzed textual data. Between 2009 and 2020, the study analyzed subject modeling in 111 publications from the top ten ranked software engineering journals.…”
Section: Introduction (mentioning)
confidence: 99%
“…To meet such requirements, OrganicRef allows the use of multiple context-selection strategies. It also relies on Topic Modeling (Silva, Galster and Gilson 2021) for identifying the features implemented by code elements. Then, OrganicRef combines feature-driven and rule-based heuristics to generate refactoring recommendations for a delimited context.…”
Section: Problem Statement and Research Questions (mentioning)
confidence: 99%
“…The DPs are detected through information extracted from the project's design and source code. From the project's design, OrganicRef uses a topic modeling (Silva, Galster and Gilson 2021) algorithm to extract the distribution of features across the project's elements. Then, it relies on the source code to collect quality measures and code smells.…”
Section: Context-sensitive Detection (mentioning)
confidence: 99%