2019
DOI: 10.1007/s10664-019-09697-7

Siamese: scalable and incremental code clone search via multiple code representations

Abstract: This paper presents a novel code clone search technique that is accurate, incremental, and scalable to hundreds of million lines of code. Our technique incorporates multiple code representations (i.e., a technique to transform code into various representations to capture different types of clones), query reduction (i.e., a technique to select clone search keywords based on their uniqueness), and a customised ranking function (i.e., a technique to allow a specific clone type to be ranked on top of the search re…

Cited by 57 publications (49 citation statements)
References: 90 publications
“…Many approaches exist to find the clones they are text-based [13], [14], token-based [15], [16], [17], [18]), tree-based [15]; [19], graph-based [16], or deep learning techniques [17] that match the similar codes among same project or different projects. Because of the large code repositories online, programmers never start coding from scratch which they obviously find it way to save time and avoid tedious work [1], [18], we can also find evident in the [18] to say that 70% of the code in GITHUB are clones.…”
Section: Fig1 Architecture Cross Language Clone Detection (mentioning)
confidence: 99%
“…Because of several MLOC's of code available on Internet, code search is becoming more common now a day's [1]. It is easier for developers to get code online than to start coding from scratch [1].…”
Section: Introduction (mentioning)
confidence: 99%