Yoon-Chan Jhi scite author profile

Detecting Software Theft via System Call Based Birthmarks

Along with the burst of open source projects, software theft (or plagiarism) has become a very serious threat to the health of the software industry. A software birthmark, which represents the unique characteristics of a program, can be used for software theft detection. We propose two system call based software birthmarks: SCSSB (System Call Short Sequence Birthmark) and IDSCSB (Input Dependant System Call Subsequence Birthmark), and examine how well they reflect unique behavioral characteristics of a program. To our knowledge, our detection system based on SCSSB and IDSCSB is the first one capable of detecting software component theft where only partial code is stolen. We demonstrate the strength of our birthmarks against various evasion techniques, including those based on different compilers and different compiler optimization levels as well as those based on very powerful obfuscation techniques supported by SandMark. Unlike existing work, which was evaluated on small or toy software, we also evaluate our birthmarks on a set of large software (web browsers). Our results show that system call based birthmarks are practical and effective in detecting software theft, even when it adopts advanced evasion techniques.


Value-based program characterization and its application to software plagiarism detection

Jhi, Wang, Jia, et al. 2011


Identifying similar or identical code fragments becomes much more challenging in code theft cases where plagiarizers can use various automated code transformation techniques to hide stolen code from detection. Previous works in this field are largely limited in that (1) most of them cannot handle advanced obfuscation techniques; (2) the methods based on source code analysis are less practical since the source code of suspicious programs is typically not available until strong evidence is collected; and (3) those depending on the features of specific operating systems or programming languages have limited applicability. Based on the observation that some critical runtime values are hard to replace or eliminate by semantics-preserving transformation techniques, we introduce a novel approach to dynamic characterization of executable programs. Leveraging such invariant values, our technique is resilient to various control and data obfuscation techniques. We show how the values can be extracted and refined to expose the critical values and how we can apply this runtime property to help solve problems in software plagiarism detection. We have implemented a prototype with a dynamic taint analyzer atop a generic processor emulator. Our experimental results show that the value-based method successfully discriminates 34 plagiarisms obfuscated by SandMark, plagiarisms heavily obfuscated by KlassMaster, programs obfuscated by Thicket, and executables obfuscated by Loco/Diablo.


Behavior based software theft detection

Wang, Jhi, Zhu, et al. 2009


Along with the burst of open source projects, software theft (or plagiarism) has become a very serious threat to the health of the software industry. A software birthmark, which represents the unique characteristics of a program, can be used for software theft detection. We propose a system call dependence graph based software birthmark called SCDG birthmark, and examine how well it reflects unique behavioral characteristics of a program. To our knowledge, our detection system based on SCDG birthmark is the first one capable of detecting software component theft where only partial code is stolen. We demonstrate the strength of our birthmark against various evasion techniques, including those based on different compilers and different compiler optimization levels as well as two state-of-the-art obfuscation tools. Unlike existing work, which was evaluated on small or toy software, we also evaluate our birthmark on a set of large software. Our results show that SCDG birthmark is practical and effective in detecting software theft, even when it adopts advanced evasion techniques.


STILL: Exploit Code Detection via Static Taint and Initialization Analyses

Wang, Jhi, Zhu, et al. 2008


IMPACT: Impersonation Attack Detection via Edge Computing Using Deep Autoencoder and Feature Abstraction

et al. 2020


An ever-increasing number of computing devices interconnected through wireless networks encapsulated in cyber-physical-social systems, and the significant amount of sensitive network data transmitted among them, have raised security and privacy concerns. An intrusion detection system (IDS) is known as an effective defence mechanism, and most recently machine learning (ML) methods have been used for its development. However, Internet of Things (IoT) devices often have limited computational resources, such as a limited energy source, computational power and memory, so traditional ML-based IDSs that require extensive computational resources are not suitable for running on such devices. This study therefore aims to design and develop a lightweight ML-based IDS tailored for resource-constrained devices. Specifically, the study proposes a lightweight ML-based IDS model, namely IMPACT (IMPersonation Attack deteCTion using deep auto-encoder and feature abstraction). It is based on deep feature learning with a gradient-based linear Support Vector Machine (SVM), and it can be deployed and run on resource-constrained devices because it reduces the number of features through feature extraction and selection using a stacked autoencoder (SAE), mutual information (MI) and C4.8 wrapper. IMPACT is trained on the Aegean Wi-Fi Intrusion Dataset (AWID) to detect impersonation attacks. Numerical results show that the proposed IMPACT achieved 98.22% accuracy with a 97.64% detection rate and a 1.20% false alarm rate, and outperformed existing state-of-the-art benchmark models. Another key contribution of this study is the investigation of the features in the AWID dataset and their usability for further development of IDS.

