Dinghao Wu scite author profile

Existing code similarity comparison methods, whether source or binary code based, are mostly not resilient to obfuscations. In the case of software plagiarism, emerging obfuscation techniques have made automated detection increasingly difficult. In this paper, we propose a binary-oriented, obfuscation-resilient method based on a new concept, longest common subsequence of semantically equivalent basic blocks, which combines rigorous program semantics with longest common subsequence based fuzzy matching. We model the semantics of a basic block by a set of symbolic formulas representing the input-output relations of the block. This way, the semantics equivalence (and similarity) of two blocks can be checked by a theorem prover. We then model the semantics similarity of two paths using the longest common subsequence with basic blocks as elements. This novel combination has resulted in strong resiliency to code obfuscation. We have developed a prototype and our experimental results show that our method is effective and practical when applied to real-world software.
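The idea of a longest common subsequence over semantically equivalent basic blocks can be illustrated with a small sketch. This is not the authors' prototype: basic blocks are modelled here as hypothetical Python functions over a variable state, and the theorem prover is replaced by a brute-force check over a small input domain, which is an assumption made only to keep the example self-contained. All names (`blocks_equivalent`, `lcs_length`, `path_similarity`, and the sample blocks) are illustrative.

```python
from itertools import product
from typing import Callable, Dict, Sequence

State = Dict[str, int]
Block = Callable[[State], State]


def blocks_equivalent(a: Block, b: Block,
                      variables: Sequence[str] = ("x", "y"),
                      domain: Sequence[int] = range(-3, 4)) -> bool:
    """Stand-in for a theorem prover: compare the input-output relation
    of two blocks on every assignment drawn from a small domain."""
    for values in product(domain, repeat=len(variables)):
        state = dict(zip(variables, values))
        if a(dict(state)) != b(dict(state)):
            return False
    return True


def lcs_length(p: Sequence[Block], q: Sequence[Block],
               same: Callable[[Block, Block], bool]) -> int:
    """Classic LCS dynamic program, with block equality replaced by `same`."""
    table = [[0] * (len(q) + 1) for _ in range(len(p) + 1)]
    for i, bp in enumerate(p, 1):
        for j, bq in enumerate(q, 1):
            if same(bp, bq):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(p)][len(q)]


def path_similarity(reference: Sequence[Block], suspect: Sequence[Block]) -> float:
    """Fraction of the reference path matched, in order, by equivalent blocks."""
    if not reference:
        return 0.0
    return lcs_length(reference, suspect, blocks_equivalent) / len(reference)


# Hypothetical reference path.
double_add = lambda s: {**s, "x": s["x"] + s["x"]}
add_y = lambda s: {**s, "y": s["y"] + s["x"]}
negate = lambda s: {**s, "x": -s["x"]}

# Hypothetical obfuscated path: rewritten arithmetic plus an inserted no-op block.
double_mul = lambda s: {**s, "x": 2 * s["x"]}
junk = lambda s: {**s}
add_y_obf = lambda s: {**s, "y": s["y"] - (-s["x"])}
negate_obf = lambda s: {**s, "x": 0 - s["x"]}

reference_path = [double_add, add_y, negate]
suspect_path = [double_mul, junk, add_y_obf, negate_obf]

print(path_similarity(reference_path, suspect_path))
```

Because matching is done on block semantics rather than on instruction text, the rewritten arithmetic still matches, and because LCS tolerates gaps, the inserted no-op block does not break the alignment.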


Subtype lesions of neovascular age-related macular degeneration in Chinese patients

Wen

Huang

et al. 2007

Graefes Arch Clin Exp Ophthalmol

159



Get Online Support, Feel Better -- Sentiment Analysis and Dynamics in an Online Cancer Survivor Community

Qiu

Zhao

Mitra

et al. 2011


Polypoidal choroidal vasculopathy in elderly Chinese patients

Wen

Chen

et al. 2004

Graefes Arch Clin Exp Ophthalmol

129



Value-based program characterization and its application to software plagiarism detection

Jhi

Wang

Jia

et al. 2011


Identifying similar or identical code fragments becomes much more challenging in code theft cases where plagiarizers can use various automated code transformation techniques to hide stolen code from being detected. Previous works in this field are largely limited in that (1) most of them cannot handle advanced obfuscation techniques; (2) the methods based on source code analysis are less practical since the source code of suspicious programs is typically not available until strong evidence is collected; and (3) those depending on the features of specific operating systems or programming languages have limited applicability. Based on an observation that some critical runtime values are hard to replace or eliminate by semantics-preserving transformation techniques, we introduce a novel approach to dynamic characterization of executable programs. Leveraging such invariant values, our technique is resilient to various control and data obfuscation techniques. We show how the values can be extracted and refined to expose the critical values and how we can apply this runtime property to help solve problems in software plagiarism detection. We have implemented a prototype with a dynamic taint analyzer atop a generic processor emulator. Our experimental results show that the value-based method successfully discriminates 34 plagiarisms obfuscated by SandMark, plagiarisms heavily obfuscated by KlassMaster, programs obfuscated by Thicket, and executables obfuscated by Loco/Diablo.
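The observation that input-derived runtime values tend to survive semantics-preserving transformations can be shown with a toy sketch. It is not the paper's taint analyzer or emulator: the `Traced` wrapper, the sample programs, and the LCS-based score are all assumptions chosen for illustration, and value refinement is omitted entirely.

```python
from typing import Callable, List, Sequence


def _raw(v):
    """Unwrap a Traced value to its plain integer (hypothetical helper)."""
    return v.value if isinstance(v, Traced) else v


class Traced:
    """Integer wrapper that logs every value computed from an input,
    a toy stand-in for dynamic taint tracking."""

    def __init__(self, value: int, log: List[int]):
        self.value = value
        self.log = log

    def _derive(self, result: int) -> "Traced":
        self.log.append(result)
        return Traced(result, self.log)

    def __add__(self, other):
        return self._derive(self.value + _raw(other))

    __radd__ = __add__

    def __mul__(self, other):
        return self._derive(self.value * _raw(other))

    __rmul__ = __mul__


def trace(program: Callable, inputs: Sequence[int]) -> List[int]:
    """Run `program` on traced inputs and return the input-derived values."""
    log: List[int] = []
    program([Traced(v, log) for v in inputs])
    return log


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Longest common subsequence length of two value sequences."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def value_similarity(reference: Callable, suspect: Callable,
                     inputs: Sequence[int]) -> float:
    """Share of the reference value sequence that reappears, in order."""
    ref_values = trace(reference, inputs)
    sus_values = trace(suspect, inputs)
    return lcs_length(ref_values, sus_values) / len(ref_values) if ref_values else 0.0


def sum_squares(xs):
    total = 0
    for x in xs:
        total = total + x * x
    return total


def sum_squares_obf(xs):
    # Same computation with a different loop shape and untainted junk work.
    i, acc, junk = 0, 0, 7
    while i < len(xs):
        junk = (junk * 3) % 11  # dead code on constants, never logged
        sq = xs[i] * xs[i]
        acc = sq + acc
        i += 1
    return acc


def sum_cubes(xs):
    total = 0
    for x in xs:
        total = total + x * x * x
    return total


print(value_similarity(sum_squares, sum_squares_obf, [2, 3]))
print(value_similarity(sum_squares, sum_cubes, [2, 3]))
```

In this sketch the restructured loop and the junk computation leave the traced value sequence unchanged, while a genuinely different program produces a sequence that only partly overlaps.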




Dinghao Wu

Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection
