Abstract-Source code plagiarism is a severe problem in academia. In academia programming assignments are used to evaluate students in programming courses. Therefore checking programming assignments for plagiarism is essential. If a course consists of a large number of students, it is impractical to check each assignment by a human inspector. Therefore it is essential to have automated tools in order to assist detection of plagiarism in programming assignments.Majority of the current source code plagiarism detection tools are based on structured methods. Structural properties of a plagiarized program and the original program differ significantly. Therefore it is hard to detect plagiarized programs when plagiarism level is 4 or above by using tools which are based on structural methods. This paper presents a new plagiarism detection method, which is based on machine learning techniques. We have trained and tested three machine learning algorithms for detecting source code plagiarism. Furthermore, we have utilized a meta-learning algorithm in order to improve the accuracy of our system.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.