DOI: 10.1007/978-3-540-74958-5_5

Statistical Debugging Using Latent Topic Models

Abstract: Statistical debugging uses machine learning to model program failures and help identify root causes of bugs. We approach this task using a novel Delta-Latent-Dirichlet-Allocation model. We model execution traces attributed to failed runs of a program as being generated by two types of latent topics: normal usage topics and bug topics. Execution traces attributed to successful runs of the same program, however, are modeled by usage topics only. Joint modeling of both kinds of traces allows us to ident…
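The generative story in the abstract can be illustrated with a short sketch. It is not the authors' implementation: the function names, the symmetric Dirichlet prior, and the topic and vocabulary sizes are all assumptions made for illustration, and the code only samples synthetic traces rather than performing the inference the paper is about. It shows the one structural constraint the abstract states: successful runs draw only from usage topics, while failed runs may draw from both usage and bug topics.

```python
import random


def dirichlet(rng, alphas):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_trace(rng, failed, n_usage, n_bug, phi, length, alpha=1.0):
    """Sample one synthetic execution trace (hypothetical helper).

    phi[k] is the word distribution of topic k; topics 0..n_usage-1 are
    usage topics and the remaining n_bug topics are bug topics.
    Returns the topic assignment and the word (e.g. predicate id) per position.
    """
    # Successful runs are restricted to usage topics; failed runs may use all.
    n_allowed = n_usage + n_bug if failed else n_usage
    theta = dirichlet(rng, [alpha] * n_allowed)  # per-run topic mixture
    topics = rng.choices(range(n_allowed), weights=theta, k=length)
    words = [rng.choices(range(len(phi[z])), weights=phi[z], k=1)[0]
             for z in topics]
    return topics, words


# Illustrative sizes only (assumed, not taken from the paper).
rng = random.Random(0)
n_usage, n_bug, vocab = 3, 2, 10
phi = [dirichlet(rng, [0.5] * vocab) for _ in range(n_usage + n_bug)]

ok_topics, ok_words = sample_trace(rng, False, n_usage, n_bug, phi, 50)
bad_topics, bad_words = sample_trace(rng, True, n_usage, n_bug, phi, 50)
```

In this sketch a bug topic can only ever explain words in a failed run, which is the asymmetry that joint modeling of both kinds of traces relies on.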

citations
Cited by 63 publications
(60 citation statements)
references
References 20 publications
“…First, topic modeling using LDA has been applied to solve a wide range of problems in software engineering such as: statistical debugging (Andrzejewski et al 2007), mining business topics (Maskeri et al 2008), mining author-topic models (Linstead et al 2007b), software traceability (Asuncion et al 2010), software categorization (Tian et al 2009), bug localization (Lukins et al 2008) etc. In an earlier work we used LDA topic modeling to mine topics from large corpus of source code, and showed that topics that emerge often resemble widely known aspects or concerns in source code (Baldi et al 2008).…”
Section: Topic Modeling (mentioning)
confidence: 99%
“…Our algorithm is able to learn joint models of both typical and rare behaviours even if they co-exist. MC-∆LDA is a generalisation and completion of the ∆LDA model proposed in [12]. ∆LDA was used for understanding code bugs in computer programs, but without an inference framework for the labels of unseen documents, ∆LDA cannot be used for classification.…”
Section: Related Work (mentioning)
confidence: 99%
“…We follow the sites-and-predicates approach commonly used in prior work [1, 17, 26–28, 45]. An instrumentation site is a single program location at which the state of the running program will be inspected.…”
Section: Terminology (mentioning)
confidence: 99%
“…Statistical debugging techniques monitor run-time behavior to identify causes of crashes in end-user executions. Lightweight instrumentation [26] allows non-intrusive post-deployment monitoring, while statistical models [1, 17, 18, 26–28, 45] identify profiled events that strongly predict crashes or other failures. Yet most programs mostly work: nearly all code in any given application is not relevant for any given bug.…”
Section: Introduction (mentioning)
confidence: 99%