2007 37th Annual Frontiers in Education Conference - Global Engineering: Knowledge Without Borders, Opportunities Without Passp 2007
DOI: 10.1109/fie.2007.4417860

Fast and reliable plagiarism detection system

citations
Cited by 16 publications
(14 citation statements)
references
References 10 publications
“…developed with the aim to improve the speed of similarity detection by using an indexed data structure to store files [9,36]. The tokenized versions of source code files are compared using an algorithm similar to the RKR-GST algorithm.…”
Section: Fpds (Fast Plagiarism Detection System) Is a Source Code Sim (mentioning)
confidence: 99%
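
The token-comparison step mentioned in the statement above can be illustrated with a minimal sketch of plain Greedy String Tiling. This is not the FPDS implementation: it omits both the Karp-Rabin hashing that gives RKR-GST its speed and the indexed file store the statement refers to. The function names, the `min_match` parameter, and the coverage-based similarity formula are assumptions chosen for illustration.

```python
def greedy_string_tiling(a, b, min_match=3):
    """Return tiles (start_a, start_b, length) of matching, non-overlapping runs.

    Plain GST sketch: no Karp-Rabin hashing, quadratic scan per round.
    """
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_len = min_match
        matches = []
        # Find all maximal unmarked matches of the current longest length.
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches = [(i, j, k)]
                    max_len = k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        for i, j, k in matches:
            # Skip matches that overlap a tile marked earlier in this round.
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for t in range(k):
                marked_a[i + t] = True
                marked_b[j + t] = True
            tiles.append((i, j, k))
    return tiles


def gst_similarity(a, b, min_match=3):
    """Fraction of tokens covered by tiles (an assumed, commonly used formula)."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match))
    return 2.0 * covered / total
```

Here `a` and `b` are token sequences, for example lists of strings. Raising `min_match` suppresses short coincidental matches at the cost of missing small copied fragments.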
“…This toolbox may be integrated into open source VLE Moodle. Other scholars deal with plagiarism in the area of using visualization method to find plagiarism in automated student assessments (Graven and MacKinnon 2008), and improving plagiarism detecting systems for the fastest and the most reliable (Mozgovoy et al 2007).…”
Section: Related Work (mentioning)
confidence: 99%
“…Statistical analysis and data mining of outliers is also a possible source to find out useful information for course instructors, for example, recognition of unconcerned students (Mozgovoy et al 2007), etc.…”
Section: Conclusion and Further Investigation (mentioning)
confidence: 99%
“…(I) Euclidean distances on the tf-idf weights like in the previous data set, however, tf and idf now refer to the occurrence of each token instead of term, (II) the Cosine distance on the token frequencies, (III) the normalized compression distance (NCD) on the token streams, (IV) Greedy String Tiling (GST) which is the inherent similarity measure that Plaggie uses to compare the given sources [29,30]; since GST yields a matrix S of pairwise similarities $s(x_i, x_j) \in S$, where values are in (0, 1) and self-similarities equal 1, we converted S into a dissimilarity matrix by taking $D := \sqrt{1 - S}$, as proposed in [23]. Fig.…”
Section: Java Programs (mentioning)
confidence: 99%
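
Three of the measures named in the statement above can be sketched on token streams as follows. This is an illustrative sketch, not the citing authors' code: the function names are hypothetical, `zlib` stands in for whichever compressor was actually used for the NCD, and the square root is applied element-wise to the similarity matrix.

```python
import math
import zlib
from collections import Counter


def cosine_distance(tokens_a, tokens_b):
    """1 minus the cosine similarity of the two token-frequency vectors."""
    fa, fb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(fa[t] * fb[t] for t in fa)
    norm_a = math.sqrt(sum(v * v for v in fa.values()))
    norm_b = math.sqrt(sum(v * v for v in fb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here for empty streams
    return 1.0 - dot / (norm_a * norm_b)


def ncd(tokens_a, tokens_b):
    """Normalized compression distance, with zlib as an assumed compressor."""
    x = " ".join(tokens_a).encode("utf-8")
    y = " ".join(tokens_b).encode("utf-8")
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + b" " + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def similarity_to_dissimilarity(S):
    """Element-wise D = sqrt(1 - S) for a similarity matrix given as nested lists."""
    return [[math.sqrt(max(0.0, 1.0 - s)) for s in row] for row in S]
```

With this conversion, a self-similarity of 1 maps to a dissimilarity of 0, so the diagonal of D is zero as expected for a distance-like matrix.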
“…We used the open source plagiarism detection software Plaggie [29] to extract a tokenized representation (a token stream) from each given Java source code. Based on the token streams, we consider four different dissimilarity measures:…”
Section: Java Programs (mentioning)
confidence: 99%