2012
DOI: 10.1093/comjnl/bxs018

A Source Code Similarity System for Plagiarism Detection

Abstract: Source code plagiarism is easy to commit but very difficult to detect without proper tool support. Various source code similarity detection systems have been developed to help detect it. Those systems need to recognize a number of lexical and structural source code modifications. For example, some structural modifications (e.g., modification of control structures, modification of data structures, or structural redesign of source code) can change the source code in such a way that…

Cited by 73 publications (46 citation statements)
References 17 publications
“…Different similarity measurements such as suffix trees, string alignment, Jaccard similarity, etc., can be applied to sequences or sets of tokens. Tools that rely on tokens include Sherlock (Joy and Luck 1999), BOSS (Joy et al 2005), Sim (Gitchell and Tran 1999), YAP3 (Wise 1996), JPlag (Prechelt et al 2002), CCFinder (Kamiya et al 2002), CP-Miner (Li et al 2006), MOSS (Schleimer et al 2003), Burrows et al (2007), and the Source Code Similarity Detector System (SCSDS) (Duric and Gasevic 2013). The token-based representation is widely used in source code similarity measurement and very efficient on a scale of millions SLOC.…”
Section: Code Similarity Measurement (mentioning)
confidence: 99%
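
The token-based comparison described in this citing passage can be sketched roughly as follows. This is a minimal illustration only, not the method of SCSDS or of any tool named above: the crude regex lexer, the tiny keyword list, the three-token window, and the names `tokenize`, `normalize`, `shingles`, and `similarity` are all assumptions made for the example.

```python
import re

# Assumed, deliberately tiny keyword list; a real tool would use a full lexer.
KEYWORDS = {"int", "float", "if", "else", "for", "while", "return"}

def tokenize(source):
    """Split source text into identifiers, integer literals, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def normalize(tokens):
    """Map identifiers to ID and numbers to NUM so that renaming has no effect."""
    out = []
    for tok in tokens:
        if tok in KEYWORDS:
            out.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out

def shingles(tokens, n=3):
    """Return the set of all n-token windows (n=3 is an arbitrary choice)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity(src_a, src_b, n=3):
    """Jaccard similarity of the normalized token n-gram sets of two snippets."""
    return jaccard(shingles(normalize(tokenize(src_a)), n),
                   shingles(normalize(tokenize(src_b)), n))

original = "int sum = a + b;"
renamed = "int total = x + y;"       # lexical change only: identifiers renamed
different = "while (i < n) i++;"     # unrelated statement

print(similarity(original, renamed))
print(similarity(original, different))
```

Because identifiers are normalized before comparison, a purely lexical change such as renaming variables leaves the score unchanged in this sketch. Structural changes alter the token windows themselves, which is why the abstract stresses that detection systems must also recognize structural modifications.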
“…Using source and bytecode obfuscators, we can create pervasively modified source code that contains modifications of lexical and structural changes. We have investigated the code transformations offered by Artifice and ProGuard and found that they cover changes commonly found in both code cloning and code plagiarism as reported by , Schulze and Meyer (2013), Duric and Gasevic (2013), Joy and Luck (1999), and Brixtel et al (2010). The details of code modifications supported by our obfuscators are shown in Table 1.…”
Section: Obfuscators (mentioning)
confidence: 99%
“…number of operator and operand); and structure-based approach determines similarity based on source code structure. Among these approaches, structure-based approach is the most popular one due to its effectiveness [6]. This approach is quite sensitive to instruction order which is important on determining source code plagiarism.…”
Section: Introduction (mentioning)
confidence: 99%