Abstract-Along with the burst of open source projects, software theft (or plagiarism) has become a very serious threat to the healthiness of software industry. Software birthmark, which represents the unique characteristic of a program, can be used for software theft detection. We propose two system call based software birthmarks: SCSSB (System Call Short Sequence Birthmark) and IDSCSB (Input Dependant System Call Subsequence Birthmark), and examine how well they reflect unique behavioral characteristics of a program. To our knowledge, our detection system based on SCSSB and IDSCSB is the first one that is capable of software component theft detection where only partial code is stolen. We demonstrate the strength of our birthmarks against various evasion techniques, including those based on different compilers and different compiler optimization levels as well as those based on very powerful obfuscation techniques supported by SandMark. Unlike the existing work that were evaluated through small or toy software, we also evaluate our birthmarks on a set of large software (web browsers). Our results show that system call based birthmarks are very practical and effective in detecting software theft that even adopts advanced evasion techniques.
Identifying similar or identical code fragments becomes much more challenging in code theft cases where plagiarizers can use various automated code transformation techniques to hide stolen code from being detected. Previous works in this field are largely limited in that (1) most of them cannot handle advanced obfuscation techniques; (2) the methods based on source code analysis are less practical since the source code of suspicious programs is typically not available until strong evidences are collected; and (3) those depending on the features of specific operating systems or programming languages have limited applicability.Based on an observation that some critical runtime values are hard to be replaced or eliminated by semanticspreserving transformation techniques, we introduce a novel approach to dynamic characterization of executable programs. Leveraging such invariant values, our technique is resilient to various control and data obfuscation techniques. We show how the values can be extracted and refined to expose the critical values and how we can apply this runtime property to help solve problems in software plagiarism detection. We have implemented a prototype with a dynamic taint analyzer atop a generic processor emulator. Our experimental results show that the value-based method successfully discriminates 34 plagiarisms obfuscated by SandMark, plagiarisms heavily obfuscated by KlassMaster, programs obfuscated by Thicket, and executables obfuscated by Loco/Diablo.
Along with the burst of open source projects, software theft (or plagiarism) has become a very serious threat to the healthiness of software industry. Software birthmark, which represents the unique characteristics of a program, can be used for software theft detection. We propose a system call dependence graph based software birthmark called SCDG birthmark, and examine how well it reflects unique behavioral characteristics of a program. To our knowledge, our detection system based on SCDG birthmark is the first one that is capable of detecting software component theft where only partial code is stolen. We demonstrate the strength of our birthmark against various evasion techniques, including those based on different compilers and different compiler optimization levels as well as two state-of-the-art obfuscation tools. Unlike the existing work that were evaluated through small or toy software, we also evaluate our birthmark on a set of large software. Our results show that SCDG birthmark is very practical and effective in detecting software theft that even adopts advanced evasion techniques.
An ever-increasing number of computing devices interconnected through wireless networks encapsulated in the cyber-physical-social systems and a significant amount of sensitive network data transmitted among them have raised security and privacy concerns. Intrusion detection system (IDS) is known as an effective defence mechanism and most recently machine learning (ML) methods are used for its development. However, Internet of Things (IoT) devices often have limited computational resources such as limited energy source, computational power and memory, thus, traditional ML-based IDS that require extensive computational resources are not suitable for running on such devices. This study thus is to design and develop a lightweight ML-based IDS tailored for the resource-constrained devices. Specifically, the study proposes a lightweight ML-based IDS model namely IMPACT (IMPersonation Attack deteCTion using deep auto-encoder and feature abstraction). This is based on deep feature learning with gradient-based linear Support Vector Machine (SVM) to deploy and run on resource-constrained devices by reducing the number of features through feature extraction and selection using a stacked autoencoder (SAE), mutual information (MI) and C4.8 wrapper. The IMPACT is trained on Aegean Wi-Fi Intrusion Dataset (AWID) to detect impersonation attack. Numerical results show that the proposed IMPACT achieved 98.22% accuracy with 97.64% detection rate and 1.20% false alarm rate and outperformed existing state-of-the-art benchmark models. Another key contribution of this study is the investigation of the features in AWID dataset for its usability for further development of IDS.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.