Proceedings of the 8th Working Conference on Mining Software Repositories 2011
DOI: 10.1145/1985441.1985451

Retrieval from software libraries for bug localization

Abstract: From the standpoint of retrieval from large software libraries for the purpose of bug localization, we compare five generic text models and certain composite variations thereof. The generic models are: the Unigram Model (UM), the Vector Space Model (VSM), the Latent Semantic Analysis Model (LSA), the Latent Dirichlet Allocation Model (LDA), and the Cluster Based Document Model (CBDM). The task is to locate the files that are relevant to a bug reported in the form of a textual description by a software develope…

Cited by 199 publications (26 citation statements)
References 26 publications
“…We could not find a bug dataset for C# projects, like iBUGS (Dallmeier and Zimmermann 2016) or moreBugs (Rao and Kak 2013b). Then, we used GitHub search functionality 2 to obtain a list of large C# projects, by searching for projects with 1000 or more stars and 100 or more forks.…”
Section: Project Selection (mentioning)
confidence: 99%
“…To foster the process of effectively identifying source code that is relevant to a particular bug report, a number of techniques have been developed using information retrieval (IR) models such as Latent Dirichlet Allocation (LDA) (Lukins et al 2010), Latent Semantic Analysis (LSA) (Rao and Kak 2011), and Vector Space Model (VSM) (Zhou et al 2012). The IR approach to bug localization generally consists of treating source files as documents, against which a query, represented by the bug report, will be run.…”
Section: Introduction (mentioning)
confidence: 99%
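The retrieval setup described in the statement above (source files as documents, the bug report as the query) can be illustrated with a minimal Vector Space Model sketch. This is not the cited authors' implementation: the function names `tokenize` and `rank_files`, the camelCase splitting, the smoothed IDF formula, and the two toy files are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens (hypothetical preprocessing)."""
    # Break camelCase identifiers at lower-to-upper boundaries: "parseDate" -> "parse Date".
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def rank_files(files, bug_report):
    """Rank source files by TF-IDF cosine similarity to a bug report.

    files: dict mapping file name -> source text.
    Returns a list of (file name, score) pairs, best match first.
    """
    docs = {name: Counter(tokenize(src)) for name, src in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files each term occurs.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}

    def weigh(counts):
        # Terms that never occur in any file carry no weight.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weigh(Counter(tokenize(bug_report)))
    scores = {name: cosine(query, weigh(counts)) for name, counts in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpus and bug report, invented for this example.
files = {
    "DateParser.java": "class DateParser { Date parseDate(String text) { return null; } }",
    "Logger.java": "class Logger { void logMessage(String message) { } }",
}
ranking = rank_files(files, "parseDate throws exception when parsing date text")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy input the bug report shares the terms "parse", "date", and "text" with `DateParser.java` and none with `Logger.java`, so the former is ranked first. Real systems typically add stemming, stop-word removal, and evaluation against files known to have been changed by the fix.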
“…Similar to our previous work [10], we have used the moreBugs [23] dataset to perform our experimental validation. The dataset contains all the necessary information to evaluate both the batch-mode and the incremental-mode approaches to IR based bug localization, namely: (a) the commit-level changes taking place in the repository; (b) the release history of the software; and (c) a set of closed/resolved issues/bugs.…”
Section: Experimental Validation, 4.1 The Dataset (mentioning)
confidence: 99%
“…We also present strategies for retraining the model after a sequence of commits or for large commits (commits that affect a significant portion of the source code) in order to keep the incrementally updated model close to the true model. In order to evaluate our incremental model update framework, we have created a benchmark dataset called moreBugs [23] that tracks commit-level changes over 10 years of developmental history of two software repositories: JodaTime and AspectJ. …”
Section: Introduction (mentioning)
confidence: 99%
“…We have therefore created a new and publicly available benchmark dataset called moreBugs [22] by mining ten years of commit history for AspectJ and JodaTime projects. …”
Section: Experimental Validation, A. The Evaluation Dataset (mentioning)
confidence: 99%