SEQUENCER: Sequence-to-Sequence Learning for End-to-End Program Repair

Chen, Zimin; Kommrusch, Steve; Tufano, Michele; Pouchet, Louis-Noël; Poshyvanyk, Denys; Monperrus, Martin

doi:10.1109/tse.2019.2940179

Cited by 233 publications

(377 citation statements)

References 52 publications

(90 reference statements)

Supporting

Mentioning

375

Contrasting

“…Our data generation tools along with documentation and detailed instructions for how to use them are available in a public GitHub repository and the dataset is publicly available in Zenodo.…”

Section: Methods

mentioning

confidence: 99%

“…The combined datasets are the CodRep dataset [4] and the Bugs2Fix dataset [26] resulting in 40,289 one-line bugs. These datasets are combined into a single dataset of one line bugs in [3]. Our datasets are of similar size consisting of 25,539 and 153,652 single-statement bugs.…”

Section: Related Work

mentioning

confidence: 99%

“…Because of this lack of data, it has not previously been possible to estimate the recall of a set of repair templates, that is, the percentage of real-world bugs that can be repaired by one of the templates. Simultaneously to the current work, a larger dataset of one-line bugs has been mined [3], but even this dataset does not attempt to classify bugs into templates.…”

mentioning

confidence: 99%

How Often Do Single-Statement Bugs Occur?

Karampatsis

Sutton

2020

Proceedings of the 17th International Conference on Mining Software Repositories

107

Program repair is an important but difficult software engineering problem. One way to achieve acceptable performance is to focus on classes of simple bugs, such as bugs with single statement fixes, or that match a small set of bug templates. However, it is very difficult to estimate the recall of repair techniques for simple bugs, as there are no datasets about how often the associated bugs occur in code. To fill this gap, we provide a dataset of 153,652 single statement bugfix changes mined from 1,000 popular open-source Java projects, annotated by whether they match any of a set of 16 bug templates, inspired by state-of-the-art program repair techniques. In an initial analysis, we find that about 33% of the simple bug fixes match the templates, indicating that a remarkable number of single-statement bugs can be repaired with a relatively small set of templates. Further, we find that template fitting bugs appear with a frequency of about one bug per 1,600-2,500 lines of code (as measured by the size of the project's latest version). We hope that the dataset will prove a resource for both future work in program repair and studies in empirical software engineering. CCS CONCEPTS • Software and its engineering → Software testing and debugging.

“…Many works have taken advantage of the "naturalness" of software [44] to assist software engineering tasks, including code completion [76], improving code readability [2], program repair [20,78], identifying buggy code [75] and API migration [38], among many others [4]. These approaches analyze large amounts of source code, ranging from hundreds to thousands of software projects, building machine learning models of source code properties, inspired by techniques from natural language processing (NLP).…”

Section: Introduction

mentioning

confidence: 99%

Big code != big vocabulary

Karampatsis

Babii

Robbes

et al. 2020

Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

138

Statistical language modeling techniques have successfully been applied to large source code corpora, yielding a variety of new software development tools, such as tools for code suggestion, improving readability, and API migration. A major issue with these techniques is that code introduces new vocabulary at a far higher rate than natural language, as new identifier names proliferate. Both large vocabularies and out-of-vocabulary issues severely affect Neural Language Models (NLMs) of source code, degrading their performance and rendering them unable to scale. In this paper, we address this issue by: 1) studying how various modelling choices impact the resulting vocabulary on a large-scale corpus of 13,362 projects; 2) presenting an open vocabulary source code NLM that can scale to such a corpus, 100 times larger than in previous work; and 3) showing that such models outperform the state of the art on three distinct code corpora (Java, C, Python). To our knowledge, these are the largest NLMs for code that have been reported. All datasets, code, and trained models used in this work are publicly available. CCS CONCEPTS • Software and its engineering → Software maintenance tools.

“…8 We categorize a defect type for the sampled code blocks based on how the defect was fixed in the bug-fixing commits. We use a taxonomy of Chen et al [10] which is summarized in Table 4. To ensure a consistent understanding of the taxonomy, the first four authors of this paper independently categorize defect types for the 30 hit and 30 missed defective blocks.…”

Section: (RQ4) What Kind of Defects Can Be Identified by Our LINE-DP?

mentioning

confidence: 99%

Predicting Defective Lines Using a Model-Agnostic Technique

Wattanakriengkrai

Thongtanunam

Tantithamthavorn

et al. 2022

IEEE Trans. Software Eng.

Defect prediction models are proposed to help a team prioritize source code areas files that need Software Quality Assurance (SQA) based on the likelihood of having defects. However, developers may waste their unnecessary effort on the whole file while only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of lines of a file are defective. Hence, in this work, we propose a novel framework (called LINE-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information why the model makes such a prediction. Broadly speaking, our LINE-DP first builds a file-level defect model using code token features. Then, our LINE-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file will be defective. Then, the lines that contain risky tokens are predicted as defective lines. Through a case study of 32 releases of nine Java open source systems, our evaluation results show that our LINE-DP achieves an average recall of 0.61, a false alarm rate of 0.47, a top 20%LOC recall of 0.27, and an initial false alarm of 16, which are statistically better than six baseline approaches. Our evaluation shows that our LINE-DP requires an average computation time of 10 seconds including model construction and defective identification time. In addition, we find that 63% of defective lines that can be identified by our LINE-DP are related to common defects (e.g., argument change, condition change). These results suggest that our LINE-DP can effectively identify defective lines that contain common defects while requiring a smaller amount of inspection effort and a manageable computation cost. The contribution of this paper builds an important step towards line-level defect prediction by leveraging a model-agnostic technique.
