Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/423
Supervised Deep Features for Software Functional Clone Detection by Exploiting Lexical and Syntactical Information in Source Code

Abstract: Software clone detection, which aims to identify code fragments with similar functionality, plays an important role in software maintenance and evolution. Many clone detection approaches have been proposed. However, most of them represent source code with hand-crafted features based on lexical or syntactical information, or with unsupervised deep features, which makes it difficult to detect functional clone pairs, i.e., pieces of code with similar functionality that differ at both the syntactical and lexical levels…

Cited by 246 publications (220 citation statements) · References 9 publications
“…Since the majority of code clone pairs are Weak Type-3/Type-4 clones, BigCloneBench is quite appropriate for evaluating semantic clone detection. In our experiment, we follow the settings of the CDLH paper [3], which discards code fragments without any tagged true or false clone pairs, leaving 9,134 code fragments. Table II shows the basic information about the two datasets in our experiment.…”
Section: A. Experiment Data (mentioning)
confidence: 99%
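The filtering step quoted above is straightforward; the sketch below is a minimal illustration of it, assuming a hypothetical data layout in which `pairs` is a list of `(frag_id_1, frag_id_2, label)` tuples with a true/false clone tag and `fragments` maps fragment ids to source code (neither name comes from the paper).

```python
def filter_untagged_fragments(fragments, pairs):
    """Keep only fragments that occur in at least one tagged (true or false)
    clone pair; all other fragments carry no supervision signal."""
    tagged_ids = set()
    for id1, id2, _label in pairs:
        tagged_ids.add(id1)
        tagged_ids.add(id2)
    return {fid: code for fid, code in fragments.items() if fid in tagged_ids}

# With the CDLH setting on BigCloneBench, this kind of filtering is what
# leaves the 9,134 code fragments mentioned above.
# kept = filter_untagged_fragments(fragments, pairs)
```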
“…CDLH [3] uses a binary Tree-LSTM [5] to encode ASTs, and a hash function that optimizes the Hamming distance between the vector representations of AST pairs. ASTNN [4] uses recursive neural networks to encode AST subtrees for statements, then feeds the encodings of all statement trees into an RNN to compute the vector representation of a program.…”
Section: B. Experiment Settings (mentioning)
confidence: 99%
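To make the encoder concrete, here is a minimal PyTorch sketch of the binary (N-ary, N=2) Tree-LSTM update from Tai et al. [5] that CDLH applies bottom-up over a binarized AST. The class and argument names (`BinaryTreeLSTMCell`, `in_dim`, `mem_dim`) are illustrative, not from either paper, and the training loss and hashing objective are omitted.

```python
import torch
import torch.nn as nn

class BinaryTreeLSTMCell(nn.Module):
    """One bottom-up step of a binary Tree-LSTM: combine a node's embedded
    input with the (h, c) states of its left and right children."""

    def __init__(self, in_dim, mem_dim):
        super().__init__()
        # Input projection for the input, output, and update gates plus
        # the two per-child forget gates, packed into one matrix.
        self.W = nn.Linear(in_dim, 5 * mem_dim)
        # Separate hidden-state projections for the left and right child.
        self.U_l = nn.Linear(mem_dim, 5 * mem_dim, bias=False)
        self.U_r = nn.Linear(mem_dim, 5 * mem_dim, bias=False)

    def forward(self, x, left, right):
        (h_l, c_l), (h_r, c_r) = left, right
        gates = self.W(x) + self.U_l(h_l) + self.U_r(h_r)
        i, o, u, f_l, f_r = gates.chunk(5, dim=-1)
        # Memory cell mixes the new candidate with both children's cells.
        c = (torch.sigmoid(i) * torch.tanh(u)
             + torch.sigmoid(f_l) * c_l
             + torch.sigmoid(f_r) * c_r)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Usage on a tiny tree: combine two leaf children into a parent node.
cell = BinaryTreeLSTMCell(in_dim=32, mem_dim=64)
zero = (torch.zeros(1, 64), torch.zeros(1, 64))   # state for leaf/absent children
h_l, c_l = cell(torch.randn(1, 32), zero, zero)   # embedded AST leaf tokens
h_r, c_r = cell(torch.randn(1, 32), zero, zero)
h_root, c_root = cell(torch.randn(1, 32), (h_l, c_l), (h_r, c_r))
# CDLH-style hashing would then binarize the root state (e.g. torch.sign(h_root))
# and compare two fragments by Hamming distance over the resulting codes.
```

ASTNN [4] differs in that it encodes each statement's subtree separately and runs a sequential RNN over the resulting statement vectors, rather than encoding the whole binarized AST with one recursive pass.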