Although various source code plagiarism detection approaches exist, most address only lexical similarity attacks, on the assumption that plagiarism is committed only by students who are not proficient in programming. However, plagiarism often results not only from a lack of ability but also from poor time management. Thus, semantic similarity attacks should also be detected and evaluated. This research proposes a source code semantic similarity detection approach that can detect most source code similarities by representing the source code as an Abstract Syntax Tree (AST) and evaluating similarity with a Siamese neural network. Because the AST is a language-dependent feature, the SOCO dataset, which consists of C++ programs, was selected. Based on the evaluation, we conclude that our approach is more effective than most existing systems for detecting source code plagiarism. The proposed strategy was implemented, and an experimental study on the AI-SOCO dataset showed that the proposed similarity measure achieved better performance for the recommendation system, improving precision, recall, and F1 score by 15%, 10%, and 22%, respectively, on the 100,000-sample dataset. As future work, the system could be improved by detecting inter-language source code similarity.
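To make the AST-plus-Siamese idea concrete, the following is a minimal sketch under strong assumptions, not the method evaluated in this research. It parses Python rather than C++ (using the standard `ast` module), and it replaces the trained Siamese neural network with a fixed, hand-written encoder: a bag of AST node types compared by cosine similarity. The only property it shares with a Siamese setup is that the same encoder is applied to both inputs. The function names `ast_node_types`, `encode`, `cosine`, and `similarity` are hypothetical.

```python
import ast
import math
from collections import Counter


def ast_node_types(source):
    """Return the type name of every AST node found in `source`."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def encode(source):
    """Shared encoder: a bag of AST node types.

    Assumption: this stands in for the learned branch of a Siamese
    network; a real system would use a trained neural encoder here.
    """
    return Counter(ast_node_types(source))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(source_a, source_b):
    """Encode both programs with the same encoder and compare them."""
    return cosine(encode(source_a), encode(source_b))


if __name__ == "__main__":
    original = (
        "def total(xs):\n"
        "    s = 0\n"
        "    for x in xs:\n"
        "        s += x\n"
        "    return s\n"
    )
    # Same structure, different identifiers (a simple disguise).
    renamed = (
        "def acc(values):\n"
        "    result = 0\n"
        "    for v in values:\n"
        "        result += v\n"
        "    return result\n"
    )
    unrelated = "print('hello')\n"

    print("renamed:  ", similarity(original, renamed))
    print("unrelated:", similarity(original, unrelated))
```

Because identifier names are not part of the node types, the renamed program scores essentially the same as an exact copy, whereas a purely lexical comparison would see many differing tokens. Deeper semantic changes, such as swapping a loop for recursion, alter the node counts, which is where a learned encoder would be expected to do better than this fixed one.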