2014
DOI: 10.1016/j.scico.2011.11.002

Tuning research tools for scalability and performance: The NiCad experience

Abstract: Clone detection is a research technique for analyzing software systems for similarities, with applications in software understanding, maintenance, evolution, license enforcement and many other issues. The NiCad near-miss clone detection method has been shown to yield highly accurate results in both precision and recall. However, its naive two-step method, involving a parsing first step to identify and normalize code fragments, followed by a text line-based second step using longest common subsequence (LCS) to …
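The line-based second step mentioned in the abstract can be illustrated with a minimal sketch. This is not NiCad's implementation: the helper names (`normalize`, `lcs_length`, `dissimilarity`) are hypothetical, the whitespace stripping is only a stand-in for the parser-based normalization of the first step, and defining dissimilarity as the share of the longer fragment's lines that fall outside the LCS is an assumption made for illustration.

```python
def normalize(fragment):
    """Stand-in for parser-based normalization (assumption):
    strip surrounding whitespace and drop blank lines."""
    return [line.strip() for line in fragment.splitlines() if line.strip()]


def lcs_length(a, b):
    """Length of the longest common subsequence of two line lists,
    computed with the standard dynamic program in O(len(a) * len(b))."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dissimilarity(frag_a, frag_b):
    """Fraction of the longer fragment's lines not covered by the LCS
    (an illustrative measure, not necessarily the one NiCad uses)."""
    a, b = normalize(frag_a), normalize(frag_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


if __name__ == "__main__":
    original = "int x = 0;\nx++;\nreturn x;"
    edited = "int x = 0;\n  x++;\nlog(x);\nreturn x;"
    print(dissimilarity(original, edited))
```

Comparing every pair of fragments this way costs quadratic time per pair and quadratic pairs overall, which is the kind of scaling pressure the title refers to.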

Cited by 10 publications (9 citation statements)
References 24 publications
“…The last two techniques allow to discard or to abstract in the comparison, respectively, a subset of the nonterminals of the language-specific grammar employed by the parser. NiCad has been chosen as reference because it is a mature and flexible tool [23], which allows for customization of both the language grammar and of the detection process, making it possible to detect Type-2 Java clones. NiCad can be configured to recognize the clones as defined in Section 3.1 if one allows for:…”
Section: Results (mentioning)
confidence: 99%
“…Different setups in corresponding clone types might result in different stability scenarios. However, the NiCad setups that we have used for detecting three types clones are considered standard [23,24,26,4,29] and thus we contend that the clone detection results that we have investigated are reliable.…”
Section: Threats To Validity (mentioning)
confidence: 89%
“…The raw samples from GitHub, however, contain clones which may affect the training performance. To reduce the impact of duplicated data on classification results, we use NiCad [13] to detect clones among the data, and remove all of the Type 1, Type 2 and Type 3 clones with a dissimilarity lower than 10%. After the clone removal, we obtained 4,932 algorithm files in Java and 4,732 for C++.…”
Section: A Datasets (mentioning)
confidence: 99%
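
The clone-removal step described in the last statement (dropping files whose dissimilarity to an already kept file is below 10%) might look roughly like the sketch below. It does not call NiCad: `difflib.SequenceMatcher` from the Python standard library is swapped in as the similarity measure, and the function names and the greedy keep-first policy are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def line_dissimilarity(text_a, text_b):
    """1 minus the SequenceMatcher ratio over stripped, non-blank lines.
    This is a stand-in measure, not NiCad's clone detection."""
    lines_a = [ln.strip() for ln in text_a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in text_b.splitlines() if ln.strip()]
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    return 1.0 - matcher.ratio()


def drop_near_duplicates(files, threshold=0.10):
    """Greedy filter (assumed policy): keep a file only if its dissimilarity
    to every file kept so far is at least `threshold`."""
    kept = []
    for name, text in files:
        if all(line_dissimilarity(text, kept_text) >= threshold
               for _, kept_text in kept):
            kept.append((name, text))
    return kept


if __name__ == "__main__":
    samples = [
        ("a.java", "int x = 0;\nx++;\nreturn x;"),
        ("b.java", "int x = 0;\n  x++;\nreturn x;"),   # same lines, reindented
        ("c.java", "String s = read();\nprint(s);\nclose();"),
    ]
    for name, _ in drop_near_duplicates(samples):
        print(name)
```

The result of a greedy filter depends on input order, so a fixed ordering of the files keeps the outcome reproducible.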