2006
DOI: 10.1109/tse.2006.28

CP-Miner: finding copy-paste and related bugs in large-scale software code

Cited by 608 publications (513 citation statements)
References 25 publications
“…Different similarity measurements such as suffix trees, string alignment, Jaccard similarity, etc., can be applied to sequences or sets of tokens. Tools that rely on tokens include Sherlock (Joy and Luck 1999), BOSS (Joy et al 2005), Sim (Gitchell and Tran 1999), YAP3 (Wise 1996), JPlag (Prechelt et al 2002), CCFinder (Kamiya et al 2002), CP-Miner (Li et al 2006), MOSS (Schleimer et al 2003), Burrows et al (2007), and the Source Code Similarity Detector System (SCSDS) (Duric and Gasevic 2013). The token-based representation is widely used in source code similarity measurement and very efficient on a scale of millions SLOC.…”
Section: Code Similarity Measurement
Citation type: mentioning
confidence: 99%
“…These techniques show positive results and open further possibilities in this research area. Examples of these techniques include Software Bertillonage (Davies et al 2013), Kolmogorov complexity (Li and Vitányi 2008), Latent Semantic Indexing (LSI) (McMillan et al 2012), and Latent Semantic Analysis (LSA) (Cosma and Joy 2012).…”
Section: Code Similarity Measurement
Citation type: mentioning
confidence: 99%
“…Many approaches have been developed over the years to detect code clones [20,23,25,26]. A code clone is two or more segments of code that have the same semantics but come from different sources.…”
Section: Code Clones and Reuse Detection
Citation type: mentioning
confidence: 99%
“…Different analysis tasks pay attention to different application aspects, such as system failure tracing [19,10], event correlation discovery [20,18,16], and event based trend analysis [5,6,7]. In practice, these methods are often conducted when the analysts already have some prior knowledge about the data.…”
Section: Related Work
Citation type: mentioning
confidence: 99%