2014 IEEE International Conference on Software Maintenance and Evolution
DOI: 10.1109/icsme.2014.77

Towards a Big Data Curated Benchmark of Inter-project Code Clones

Abstract: Recently, new applications of code clone detection and search have emerged that rely upon clones detected across thousands of software systems. Big data clone detection and search algorithms have been proposed as an embedded part of these new applications. However, there exists no previous benchmark data for evaluating the recall and precision of these emerging techniques. In this paper, we present a big data clone detection benchmark that consists of known true and false positive clones in a big data inter-pr…

Cited by 238 publications (172 citation statements)
References: 12 publications

“…We then demonstrate SourcererCC's execution for a Big Data inter-project repository, one of the prime targets of scalable clone detection. We measure its clone recall using two benchmarks: The Mutation and Injection Framework [23,35] and BigCloneBench [30,34]. We measure the precision of our tool by manually validating a statistically significant sample of its output for the BigCloneBench experiment.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, we recognized that a modern benchmark of real clones is also required. So we developed an efficient clone validation strategy based on code functionality and built BigCloneBench [30], a Big Data clone benchmark containing 8 million validated clones within and between 25,000 open-source projects. It measures recall for an extensive variety of real clones produced by real developers.…”
Section: Recall (mentioning)
confidence: 99%
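
The statements above describe two measurements: recall against a benchmark of known true clones, and precision estimated from a manually validated sample of a tool's output. A minimal sketch of that arithmetic follows. It is an illustration only, not the evaluation procedure of the benchmark or of any cited tool. The function names (`pair`, `recall`, `sampled_precision`) and the fragment identifiers are hypothetical.

```python
from typing import FrozenSet, Iterable, Set

# A clone pair is unordered, so (a, b) and (b, a) must compare equal.
ClonePair = FrozenSet[str]


def pair(a: str, b: str) -> ClonePair:
    """Build an order-independent clone pair from two fragment identifiers."""
    return frozenset((a, b))


def recall(reported: Set[ClonePair], benchmark_true: Set[ClonePair]) -> float:
    """Fraction of the benchmark's known true clones that the tool reported."""
    if not benchmark_true:
        raise ValueError("benchmark must contain at least one true clone")
    return len(reported & benchmark_true) / len(benchmark_true)


def sampled_precision(judgements: Iterable[bool]) -> float:
    """Fraction of a manually validated sample that was judged a true clone."""
    judged = list(judgements)
    if not judged:
        raise ValueError("sample must contain at least one judgement")
    return sum(judged) / len(judged)


# Hypothetical data: four known true clones, three pairs reported by a tool.
benchmark = {
    pair("A.foo", "B.foo"),
    pair("A.bar", "C.bar"),
    pair("B.baz", "C.baz"),
    pair("A.qux", "D.qux"),
}
reported = {
    pair("B.foo", "A.foo"),  # same pair as in the benchmark, reversed order
    pair("A.bar", "C.bar"),
    pair("X.y", "Z.w"),      # not in the benchmark
}

print(recall(reported, benchmark))                    # 0.5
print(sampled_precision([True, True, False, True]))   # 0.75
```

Recall is computed only over pairs the benchmark knows about, so a reported pair missing from the benchmark says nothing about precision on its own. That is one reason the quoted work estimates precision separately, from a hand-checked sample.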