Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1 2010
DOI: 10.1145/1806799.1806817

Software traceability with topic modeling

Abstract: Software traceability is a fundamentally important task in software engineering. The need for automated traceability increases as projects become more complex and as the number of artifacts increases. We propose an automated technique that combines traceability with a machine learning technique known as topic modeling. Our approach automatically records traceability links during the software development process and learns a probabilistic topic model over artifacts. The learned model allows for the semantic cat…

Cited by 261 publications (158 citation statements)
References 39 publications
“…First, topic modeling using LDA has been applied to solve a wide range of problems in software engineering such as: statistical debugging (Andrzejewski et al 2007), mining business topics (Maskeri et al 2008), mining author-topic models (Linstead et al 2007b), software traceability (Asuncion et al 2010), software categorization (Tian et al 2009), bug localization (Lukins et al 2008) etc. In an earlier work we used LDA topic modeling to mine topics from large corpus of source code, and showed that topics that emerge often resemble widely known aspects or concerns in source code (Baldi et al 2008).…”
Section: Topic Modeling mentioning
confidence: 99%
“…The Value Iteration algorithm is an iterative backup operation. The algorithm combines an immediate policy improvement for the current state and the values of states reachable from the current state in the following form: (4) where $P^{a}_{ss'}$ and $R^{a}_{ss'}$ bear the same meaning as defined in Equation 2. The value of state s is maximized across all actions a available at s. The pseudo code for the Value Iteration algorithm is shown below:…”
Section:  mentioning
confidence: 99%
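The quoted statement breaks off before its equation (4) and pseudo code, so neither can be recovered from this page. As a rough illustration only, the sketch below assumes the standard Bellman optimality backup, in which the value of state s is the maximum over actions a of the sum over successors s' of P^a_{ss'} (R^a_{ss'} + gamma * V(s')). The MDP encoding, the names `value_iteration`, `greedy_policy`, and `EXAMPLE_MDP`, and the discount factor `gamma` are all assumptions made for this example and do not come from the citing paper.

```python
from typing import Dict, List, Tuple

# Hypothetical MDP encoding (an assumption, not taken from the quoted paper):
# transitions[s][a] is a list of (probability, next_state, reward) triples,
# i.e. P^a_{ss'} and R^a_{ss'} for every successor s' of s under action a.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def backup(outcomes: List[Tuple[float, str, float]],
           values: Dict[str, float], gamma: float) -> float:
    """Expected return of one action: sum over s' of P * (R + gamma * V(s'))."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8, max_sweeps: int = 10_000) -> Dict[str, float]:
    """Sweep all states until the largest value change falls below tol."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_sweeps):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: its value stays at 0
                continue
            # Maximize the backed-up value across all actions available at s.
            best = max(backup(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update, reused within the same sweep
        if delta < tol:
            break
    return values


def greedy_policy(transitions: Transitions, values: Dict[str, float],
                  gamma: float = 0.9) -> Dict[str, str]:
    """Pick, for each non-terminal state, the action with the best backup."""
    return {
        s: max(actions, key=lambda a: backup(actions[a], values, gamma))
        for s, actions in transitions.items() if actions
    }


# Tiny illustrative MDP: "safe" ends the episode with reward 1, while "risky"
# either ends it with reward 4 or returns to "start" with reward 0.
EXAMPLE_MDP: Transitions = {
    "start": {
        "safe": [(1.0, "end", 1.0)],
        "risky": [(0.5, "end", 4.0), (0.5, "start", 0.0)],
    },
    "end": {},
}
```

Calling `value_iteration(EXAMPLE_MDP)` and then `greedy_policy(EXAMPLE_MDP, values)` yields a value per state and the greedy action for `start`. The in-place update is one common variant; a version that builds a fresh value table on each sweep converges to the same fixed point.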
“…In this section we provide details behind LDA followed by how RTM extends this model to capture links among documents. While LDA has been previously applied in the context of software engineering for measuring conceptual cohesion of classes [27], recovering traceability links [3,31], mining software repositories [4,30,36] and bug location [28], RTM has not been utilized for software measurement tasks before.…”
Section: Using Relational Topic Models For Coupling Measurement mentioning
confidence: 99%
“…The description of the study follows the Goal-Question-Metrics paradigm outlined by Basili et al [5]. All the data used and generated in this section has been posted online to ensure reproducibility of our results.…”
Section: Case Study mentioning
confidence: 99%