2013 10th Working Conference on Mining Software Repositories (MSR) 2013
DOI: 10.1109/msr.2013.6624028

The Eclipse and Mozilla defect tracking dataset: A genuine dataset for mining bug information

Cited by 63 publications (35 citation statements)
References 9 publications
“…This dataset includes all the bug reports and their histories from their inception to May 2014. For Java projects, we have used Lamkanfi et al's [16] bug dataset to extract the bug information. The Java dataset includes all the bug reports and their histories from their inception to March 2011 for these four projects (extracted from Eclipse Bugzilla database).…”
Section: Subject Systems (mentioning)
confidence: 99%
“…Then using JGit APIs, we extracted all the commit messages from the histories and searched all numbers. Then we matched each number with the bug IDs. To further ensure that those are indeed bug IDs, we only accepted those commits that contain additional information.…”
Section: Identification Of Faulty Source Code (mentioning)
confidence: 99%
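The matching step described in the statement above can be sketched roughly as follows. This is a minimal illustration in Python, not the cited authors' code: the quoted study read commit messages through JGit, whereas here they are assumed to be already available as plain strings. The function name `candidate_bug_links` and the sample commits and bug IDs are hypothetical. The additional filter the statement mentions ("commits that contain additional information") is not specified in the excerpt, so it is left out.

```python
import re

# Matches every run of digits in a commit message.
NUMBER = re.compile(r"\d+")


def candidate_bug_links(commits, bug_ids):
    """Map each commit hash to the known bug IDs that appear as numbers in its message.

    commits: dict of commit hash -> commit message (hypothetical input shape)
    bug_ids: set of integer bug IDs taken from the bug dataset
    """
    links = {}
    for sha, message in commits.items():
        # Collect every number in the message, then keep only known bug IDs.
        found = {int(token) for token in NUMBER.findall(message)}
        matched = sorted(found & bug_ids)
        if matched:
            links[sha] = matched
    return links


# Hypothetical sample data for illustration only.
commits = {
    "a1b2c3": "Fix for bug 31415: NPE in parser",
    "d4e5f6": "Bump version to 3 and update copyright 2011",
    "0719ab": "Bug 27182 and 31415 - refactor resolver",
}
bug_ids = {27182, 31415}

print(candidate_bug_links(commits, bug_ids))
```

Matching bare numbers alone would link unrelated values such as years or version numbers whenever they collide with a bug ID, which is presumably why the quoted study applies a further acceptance check on top of this step.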
“…Challenges in Adoption of Research: When authors were designing the solution, it was not clear whether we need to adopt techniques which are claimed to be achieving best performance on open source data sets ([11]) for the problem of duplicate defect identification. Interestingly most of the published works with high scores in detecting duplicate bugs apply supervised machine learning techniques (including deep learning), which though work well on large open source data sets, however, can't be easily adopted in practice without incurring significant cost to create desired training data in form of semantically equivalent pairs of bugs by the experienced SMEs before tool could be adopted in practice.…”
Section: Relevant Results Relevant + Related Results Top (mentioning)
confidence: 99%
“…We have used Lamkanfi et al's [21] bug dataset, obtained from the Eclipse Bugzilla database, to obtain the bug information associated with these projects. This dataset includes all the bug reports and their histories from the project inception to March 2011 for these four projects.…”
Section: B Subject Systems (mentioning)
confidence: 99%