2019 IEEE 1st International Workshop on Intelligent Bug Fixing (IBF)
DOI: 10.1109/ibf.2019.8665475
A Comprehensive Study of Automatic Program Repair on the QuixBugs Benchmark

Abstract: Automatic program repair papers tend to repeatedly use the same benchmarks. This poses a threat to the external validity of the findings of the program repair research community. In this paper, we perform an automatic repair experiment on a benchmark called QuixBugs that has never been studied in the context of program repair. In this study, we report on the characteristics of QuixBugs, and study five repair systems, Arja, Astor, Nopol, NPEfix and RSRepair, which are representatives of generate-and-validate re…

Cited by 35 publications (25 citation statements) · References 36 publications
“…For instance, in the paper in which jGenProg [27] is presented, there is an evaluation on Defects4J: this evaluation has no citation in the second column of the table because the evaluation is in jGenProg's paper. Later, it was evaluated again on Defects4J [26] and also on QuixBugs [48], which contain citations of the empirical evaluation papers in the table. The table also presents additional information on the evaluations, which are the number of bugs given as input to the repair tools, and the number of bugs for which the tools generated a test-suite adequate patch (i.e.…”
Section: State of Affairs on Test-Suite-Based Automatic Repair Tools (mentioning confidence: 99%)
“…They also found that a small number of bugs (9/47) could be repaired with a test-suite adequate patch that is also correct. Ye et al [48] presented a study where nine repair tools were executed on the bugs from QuixBugs. They used automatically generated test cases based on the human-written patches to identify incorrect patches generated by the repair tools.…”
Section: Related Work (mentioning confidence: 99%)
“…The other aspect is test adequacy of the buggy class. We use line coverage and branch coverage to measure it, as existing studies do [49,50]. Our intuition is that the test quality measured by line and branch coverage is related to the type of correct patches generated for this bug, since existing studies have shown that the correctness (i.e., plausible, overfitting, or correct) of APR-generated patches has a strong correlation with the test quality [32,34,36].…”
Section: Research Questions (mentioning confidence: 99%)
“…The data of patch complexity is from the previous study [48], in which the characteristics of each bug in Defects4J have been analyzed. The data of test adequacy is calculated by Cobertura, a free Java tool widely used in recent studies [49,50]. If different types of patches are generated for the same bug, the data of this bug is added into all the relevant types for analysis.…”
Section: Bug Characteristics (mentioning confidence: 99%)
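The citation statements above repeatedly invoke the distinction between a *test-suite adequate* patch (one that passes every test in the provided suite) and a *correct* patch (one that is semantically right, typically judged against held-out tests or the human-written fix). A minimal sketch of that distinction, with hypothetical functions and test data not taken from the paper:

```python
# Hedged illustration (not the paper's actual harness) of the
# "test-suite adequate" vs. "correct" patch distinction used in
# APR evaluations on QuixBugs and Defects4J.

def run_suite(candidate, suite):
    # A patch is test-suite adequate iff it passes every test in the suite.
    return all(candidate(*args) == expected for args, expected in suite)

# Reference (correct) implementation of gcd, standing in for the
# human-written patch.
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

# A hypothetical overfitting "patch": it hard-codes the expected
# outputs of the weak suite instead of computing gcd.
def overfitting_gcd(a, b):
    return {(12, 8): 4, (9, 3): 3}.get((a, b), 1)

weak_suite = [((12, 8), 4), ((9, 3), 3)]   # the suite the repair tool saw
held_out   = [((10, 4), 2)]                # an automatically generated extra test

adequate = run_suite(overfitting_gcd, weak_suite)  # True: passes the given suite
correct  = run_suite(overfitting_gcd, held_out)    # False: fails the held-out test
```

This is why, as in the Ye et al. study quoted above, additional tests generated from the human-written patch can expose patches that are test-suite adequate yet incorrect.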