2021
DOI: 10.48550/arxiv.2110.14081
Preprint

A Controlled Experiment of Different Code Representations for Learning-Based Bug Repair

Abstract: Training a deep learning model on source code has gained significant traction recently. Since such models reason over vectors of numbers, source code must first be converted to a code representation, which is then transformed into vectors. Numerous approaches have been proposed to represent source code, from sequences of tokens to abstract syntax trees. However, there is no systematic study of the effect of code representation on learning performance. Through a controlled experiment, we examine the…
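
The pipeline the abstract describes, turning the same program into either a flat token sequence or an AST before vectorization, can be illustrated with a minimal sketch. This is my own illustration using Python's standard tokenize and ast modules, not the paper's actual implementation:

    # Illustrative sketch only: two of the code representations the study
    # compares, produced for the same (buggy) Python snippet.
    import ast
    import io
    import tokenize

    buggy_code = "def add(a, b):\n    return a - b  # bug: should be a + b\n"

    # Token-sequence representation: a linear stream of (type, text) pairs.
    tokens = [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(buggy_code).readline)
        if tok.string.strip()
    ]
    print(tokens)

    # AST representation: a tree exposing the snippet's syntactic structure.
    print(ast.dump(ast.parse(buggy_code), indent=2))

    # Either representation would then be embedded into vectors before being
    # fed to a learning-based repair model.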

Cited by 2 publications (1 citation statement)
References 51 publications (86 reference statements)
“…Automated bug-fixing techniques based on DL can rely on different levels of code abstraction. Word tokenization is a commonly used one, even if higher-level abstractions (e.g., AST-based) allow to achieve better results [51].…”
Section: Automatic Bug-fixing
confidence: 99%