Proceedings of the 2018 Network and Distributed System Security Symposium (NDSS)
DOI: 10.14722/ndss.2018.23304

When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries

Abstract: The ability to identify authors of computer programs based on their coding style is a direct threat to the privacy and anonymity of programmers. While recent work found that source code can be attributed to authors with high accuracy, attribution of executable binaries appears to be much more difficult. Many distinguishing features present in source code, e.g. variable names, are removed in the compilation process, and compiler optimization may alter the structure of a program, further obscuring features that …
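
As a rough illustration of the attribution setting the abstract describes (not the authors' actual feature set or model), the sketch below disassembles raw machine code with Capstone, counts instruction-mnemonic bigrams as one kind of style feature that survives compilation, and trains a random forest on them. The byte strings and author labels are fabricated placeholders.

```python
# Illustrative sketch only: mnemonic-bigram features + random forest.
# The paper itself uses a much richer feature set (disassembly,
# control-flow, and decompiled-AST features); this is a toy stand-in.
from collections import Counter
from capstone import Cs, CS_ARCH_X86, CS_MODE_64        # pip install capstone
from sklearn.ensemble import RandomForestClassifier     # pip install scikit-learn
from sklearn.feature_extraction import DictVectorizer

def mnemonic_bigrams(code: bytes) -> Counter:
    """Disassemble x86-64 bytes and count adjacent mnemonic pairs."""
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    mnems = [insn.mnemonic for insn in md.disasm(code, 0x1000)]
    return Counter(f"{a} {b}" for a, b in zip(mnems, mnems[1:]))

# Fabricated training data: (raw .text bytes, author label) pairs.
samples = [
    (b"\x55\x48\x89\xe5\x89\x7d\xfc\x5d\xc3", "alice"),  # push; mov; mov; pop; ret
    (b"\x48\x31\xc0\x48\xff\xc0\xc3",         "bob"),    # xor; inc; ret
]
vec = DictVectorizer()
X = vec.fit_transform([dict(mnemonic_bigrams(code)) for code, _ in samples])
y = [author for _, author in samples]
clf = RandomForestClassifier(n_estimators=100).fit(X, y)  # attribution model
```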

Cited by 69 publications (122 citation statements) · References 29 publications

Citing statements, ordered by relevance:
“…To address the challenges of achieving formal utility-loss guarantees, e.g., 0 label loss and bounded confidence score distortion, we design new methods to find adversarial examples. Other than membership inference attacks, many other attacks rely on machine learning classifiers, e.g., attribute inference attacks [11,17,28], website fingerprinting attacks [7,22,29,46,67], side-channel attacks [73], location attacks [5,45,52,72], and author identification attacks [8,41]. For instance, online social network users are vulnerable to attribute inference attacks, in which an attacker leverages a machine learning classifier to infer users' private attributes (e.g., gender, political view, and sexual orientation) using their public data (e.g., page likes) on social networks.…”
Section: Discussion and Limitations
confidence: 99%
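
The attack pattern this statement describes is simple to reproduce in miniature. The following sketch, with entirely fabricated data, trains a logistic-regression "attacker" to infer a hidden attribute from public page-like indicators; the model choice and feature encoding are arbitrary assumptions, not taken from the cited work.

```python
# Toy attribute inference attack: public likes -> private attribute.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows = users; columns = whether each user liked pages 0..4 (public data).
likes = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
])
private_attr = np.array([0, 1, 0, 1])  # attribute known for training users

attacker = LogisticRegression().fit(likes, private_attr)
victim = np.array([[1, 0, 1, 1, 0]])   # a user who revealed only their likes
print(attacker.predict(victim))        # the attacker's inferred attribute
```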
“…When the unpacking routine has finished its run, the execution pointer jumps to the first instruction of the original program. For example, UPX (Ultimate Packer for eXecutables) is a free and open-source executable packer that mainly compresses executables rather than obfuscating them. Authors may use this technique to load their programs into memory faster, owing to the smaller file size.…”
Section: Other Methods of Protecting Binaries
confidence: 99%
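
Since the statement names UPX specifically, here is a small sketch of the pack/unpack round trip it describes. It assumes the `upx` tool is on the PATH and that `./program` is some local executable; both are assumptions made for the demo, while the flags used (`-9`, `-t`, `-d`) are standard UPX options.

```python
# Pack an executable with UPX, verify it, then restore the original.
import shutil
import subprocess
from pathlib import Path

src, packed = Path("program"), Path("program.packed")   # hypothetical paths
shutil.copy(src, packed)

subprocess.run(["upx", "-9", str(packed)], check=True)  # compress in place, best ratio
print(f"original: {src.stat().st_size} B -> packed: {packed.stat().st_size} B")

subprocess.run(["upx", "-t", str(packed)], check=True)  # integrity-test the packed file
subprocess.run(["upx", "-d", str(packed)], check=True)  # unpack back to the original
```

At run time the packed binary behaves as the quote describes: the UPX stub decompresses the original program image in memory and then jumps to its original entry point.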
“…In some prominent studies such as [14], [2], authors have utilized machine learning techniques to correlate syntax-based features with authorship and thereby identify the author of program binaries. In [3], the authors analyzed the effects of compiler optimization (at three levels), removing symbol information, and applying basic binary obfuscation methods (such as instruction replacement and control-flow-graph obfuscation) on several features obtained mainly from disassembling and decompiling the executable binaries (e.g. token n-grams and features derived from the Abstract Syntax Tree).…”
Section: Other Methods of Protecting Binaries
confidence: 99%
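
The effect of optimization on such features is easy to observe directly. This sketch (assuming `gcc` and `objdump` are installed; the C snippet is a made-up example) compiles one function at -O0 and -O2 and compares the resulting mnemonic-bigram sets, a crude proxy for the token n-gram features the quote mentions.

```python
# Compare instruction-mnemonic bigrams of the same code at -O0 vs -O2.
import subprocess
from pathlib import Path

Path("demo.c").write_text(
    "int sum(int n){int s=0;for(int i=0;i<n;i++)s+=i;return s;}\n")

def bigrams(opt: str) -> set:
    subprocess.run(["gcc", opt, "-c", "demo.c", "-o", "demo.o"], check=True)
    dump = subprocess.run(["objdump", "-d", "demo.o"], check=True,
                          capture_output=True, text=True).stdout
    # Disassembly lines look like: "   4:\t48 89 e5 \tmov %rsp,%rbp"
    mnems = [line.split("\t")[2].split()[0]
             for line in dump.splitlines() if line.count("\t") >= 2]
    return set(zip(mnems, mnems[1:]))

o0, o2 = bigrams("-O0"), bigrams("-O2")
print(f"bigram Jaccard(-O0, -O2) = {len(o0 & o2) / len(o0 | o2):.2f}")
```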
“…Rosenblum et al. [21] apply machine learning to style features extracted from binaries; Caliskan-Islam et al. [6] build on this work. Muir and Wikström [17] find that changing compiler settings and linking statically can decrease attribution accuracy and obfuscate authorship against binary attribution classifiers, which is, to our knowledge, the only other work focused on authorship obfuscation for programs.…”
Section: Classifying Binaries
confidence: 99%