2020
DOI: 10.1145/3409331
Modular Tree Network for Source Code Representation Learning

Abstract: Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural information of programs. Although abstract syntax tree (AST)-based neural models can handle the tree structure in the source code, they cannot capture the richness of different types of substructure in programs. In this article, we propose a modular tree network that dynamically c…
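To make the "modular" idea in the abstract concrete, here is a minimal sketch, not the paper's actual model: embeddings are composed bottom-up over a Python AST, and each node type is routed to its own composition function (a stand-in for a learned neural module). All function names, the routing table, and the toy deterministic vector operations are illustrative assumptions.

```python
import ast

DIM = 4  # toy embedding dimensionality

def node_vec(node):
    # Toy embedding: derive a small fixed vector from the node-type name.
    # A real model would look up a learned embedding instead.
    h = sum(ord(c) for c in type(node).__name__) % 7
    return [(h + i) % 7 / 7.0 for i in range(DIM)]

def compose_default(vec, child_vecs):
    # Default module: element-wise average of the node and child vectors.
    vecs = [vec] + child_vecs
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def compose_binop(vec, child_vecs):
    # A distinct module for binary operations: element-wise max.
    vecs = [vec] + child_vecs
    return [max(v[i] for v in vecs) for i in range(DIM)]

# The "modular" routing table: AST node type -> composition module.
MODULES = {ast.BinOp: compose_binop}

def encode(node):
    # Recursively encode children, then combine them with the module
    # selected by this node's type.
    child_vecs = [encode(c) for c in ast.iter_child_nodes(node)]
    vec = node_vec(node)
    if not child_vecs:
        return vec  # leaf node: embedding only
    module = MODULES.get(type(node), compose_default)
    return module(vec, child_vecs)

tree = ast.parse("x = a + b")
vec = encode(tree)
```

The point of the routing table is that different substructures (e.g. a `BinOp` versus a generic statement) are composed by different functions, which is the richness of substructure the abstract says plain tree models miss.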

Cited by 39 publications (21 citation statements)
References 35 publications
“…More recently, Wang et al [21] introduced heterogeneous program graphs by including additional type information for nodes and edges in an AST and used GNNs to learn program properties. In another work, Wang et al [30] use a modular tree-based neural network to detect the semantic difference in code using AST. Some works use Data Flow Graphs to represent source code [31], [32].…”
Section: Evaluation and Results
confidence: 99%
“…The majority of the studies rely on RNN-based DL models. Among them, some of the studies [21,53,133,333,339] employed LSTM-based models, while others [54,135,152,348,360] used GRU-based models. Among the other kinds of ML models, studies employed GNN-based [85,341], DNN [230], conditional random fields [22], SVM [184,253], and CNN-based models [69,225,312].…”
Section: Model Training
confidence: 99%
“…Code representation learning aims at learning the semantics of programs for facilitating various downstream tasks related to program comprehension, such as code clone detection, code summarization, bug detection [30,7,10,28,14,27,15,11], etc. The development of deep learning techniques boosts the research on code representation learning.…”
Section: Code Representation Learning
confidence: 99%
“…In the work [7,10], code snippets are split into tokens and fed into neural networks such as RNNs and multi-head attentions for the representation learning. Considering the structural nature of code, [28,14,30] combine the abstract syntax trees (ASTs) into neural networks for capturing the code semantics. LeClair et al [14] use GNN-based encoder to model the AST of each program subroutine.…”
Section: Code Representation Learning
confidence: 99%