Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering 2016
DOI: 10.1145/2915970.2916007

The jinx on the NASA software defect data sets

Abstract: Background: The NASA datasets have previously been used extensively in studies of software defects. In 2013 Shepperd et al. presented an essential set of rules for removing erroneous data from the NASA datasets making this data more reliable to use. Objective: We have now found additional rules necessary for removing problematic data which were not identified by Shepperd et al. Results: In this paper, we demonstrate the level of erroneous data still present even after cleaning using Shepperd et al.'s rules and…

Cited by 45 publications (26 citation statements)
References 9 publications
“…Shepperd et al [64] raise concerns related to data quality in the NASA datasets. Furthermore, Petrić et al [59] show that problematic data remain in the cleaned NASA datasets. Thus, the quality of the NASA datasets is questionable.…”
Section: Studied Datasets (mentioning)
confidence: 99%
“…Ghotra et al (Rep [8]) did 2 replication runs of Lessmann et al (Org [5]). The first run was based on uncleaned NASA data (including duplicate and inconsistent instances, see [25]) to confirm if no single classifier is best as in the original (Org [5]). The Friedman test "We used the Scott-Knott test to overcome the confounding issue of overlapping groups that are produced by several other post hoc tests, such as Nemenyis test [13], which was used by the original study.…”
Section: Rq4: Do Original and Replication Studies In Defect Predictio… (mentioning)
confidence: 99%
“…The curated data by Shepperd et al [26] has been cleaned further by Petrić et al [25]. The data errors found during this further cleaning may have also affected previous models.…”
Section: Rq4: Do Original and Replication Studies In Defect Predictio… (mentioning)
confidence: 99%
“…• Lines of code should be less than the length of the file [31] (though cumulative code changes or code churn may exceed the length of the file);…”
Section: Evaluating Quality Of Data (mentioning)
confidence: 99%
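
The integrity rule quoted in the last statement (lines of code should not exceed the length of the file) can be illustrated with a minimal Python sketch. The field names `loc_code` and `loc_total` are hypothetical and are not taken from the NASA datasets or from the cited papers. The sketch also assumes that a record whose code-line count equals its total line count is acceptable, which is slightly more lenient than the quoted wording.

```python
def violates_loc_rule(record, code_key="loc_code", total_key="loc_total"):
    """Return True when a record reports more code lines than total lines.

    The key names are hypothetical placeholders, not real dataset columns.
    """
    return record[code_key] > record[total_key]


def split_by_loc_rule(records, code_key="loc_code", total_key="loc_total"):
    """Partition records into (consistent, problematic) lists."""
    consistent, problematic = [], []
    for record in records:
        if violates_loc_rule(record, code_key, total_key):
            problematic.append(record)
        else:
            consistent.append(record)
    return consistent, problematic


if __name__ == "__main__":
    # Illustrative values only; these are not real measurements.
    sample = [
        {"module": "a", "loc_code": 10, "loc_total": 12},
        {"module": "b", "loc_code": 15, "loc_total": 12},  # implausible
        {"module": "c", "loc_code": 12, "loc_total": 12},
    ]
    kept, flagged = split_by_loc_rule(sample)
    print([r["module"] for r in kept])
    print([r["module"] for r in flagged])
```

Returning the flagged records separately, rather than dropping them silently, keeps it possible to report how many instances a rule affects.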