2012 9th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE) 2012
DOI: 10.1109/iceee.2012.6421180

Batch source-code plagiarism detection using an algorithm for the bounded longest common subsequence problem

Abstract: Source-code plagiarism detection is an unfortunate but necessary activity when reviewing assignments in programming courses. Although they are reasonably easy to fool, string-based comparisons offer a high degree of accuracy with almost no false positives, and the length of the longest common subsequence of two strings is usually a good similarity metric. For two strings, the dynamic programming algorithm for this calculation unfortunately takes quadratic time even if the strings are equal. In this paper we pres…
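To make the quadratic cost mentioned in the abstract concrete, here is a minimal sketch of the textbook dynamic programming algorithm for the length of the longest common subsequence. It is not the bounded algorithm the paper goes on to present, since the abstract is truncated before describing it. The function names `lcs_length` and `lcs_similarity`, and the normalisation by the longer string's length, are assumptions made for illustration only.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (textbook DP)."""
    # Only two rows of the DP table are kept, so space is O(len(b)),
    # but every pair of positions is still visited: O(len(a) * len(b)) time,
    # even when a == b.
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last character of one string or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical similarity score in [0, 1]: LCS length over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else lcs_length(a, b) / longest
```

In a batch setting this function would be called once per pair of submissions, which is why the quadratic per-pair cost matters.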

Cited by 7 publications (3 citation statements)
References 7 publications

“…Winnowing algorithm (Schleimer, Wilkerson, & Aiken, 2003) is employed reduce the number of terms in order to accelerate the detection process. LCS is used to compare two strings and to find out the longest overlapping path (Campos & Martinez, 2012; Iliopoulos & Rahman, 2009). The result shows that reducing non-relevant document shortens the processing time compared to non-reduced process.…”
Section: Brief Review Of Literature (mentioning)
confidence: 99%
“…the shortest common superstring. Such problems have important applications in bio-informatics as in sequence and genome analysis [12,22], in linguistic information retrieval [24]; plagiarism detection, for instance in publications [11] or source code [17,5]; data compression [21] and so on.…”
Section: Introduction (mentioning)
confidence: 99%
“…The Longest Common Subsequence (LCS) is a well known scoring-based method [15,16,17] and it is widely applied in pattern recognition such as hand gesture recognition [18], music matching [19], DNA sequence clustering [20] and code detection [21]. LCS is the comparison between different sequences to identify the longest subsequence structure which all the sequences contain.…”
Section: Introduction (mentioning)
confidence: 99%