Proceedings of the 2019 Workshop on Binary Analysis Research (BAR 2019)
DOI: 10.14722/bar.2019.23057
A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis

Abstract: Given a closed-source program, such as most proprietary software and viruses, binary code analysis is indispensable for many tasks, such as code plagiarism detection and malware analysis. Today, source code is very often compiled for various architectures, making cross-architecture binary code analysis increasingly important. A binary, after being disassembled, is expressed in an assembly language. Thus, recent work has started exploring Natural Language Processing (NLP) inspired binary code analysis. In NLP, w…

Cited by 39 publications (27 citation statements) · References 42 publications
“…1:1 mapping phase that is similar to the function matching research and classification phase using the semantic-aware neural network. In the literature, function matching is addressed by using traditional feature-based approaches [1], [7], [3], [8], [2], [5], [4], [9], [10] and also by using deep learning approaches [11], [12], [13] with the objective of finding the similarity between two functions. In contrast, our neural network-based approach aims to find the similarities and differences in two binary functions.…”
Section: Related Work (mentioning)
confidence: 99%
“…Zuo et al. [12] proposed INNEREYE, which uses an LSTM to treat instructions as words and basic blocks as sentences, and trains the neural network to compare two basic block embeddings across architectures to predict their similarity score. Redmond et al. [13] extend Zuo et al.'s [12] work, using a joint learning approach to generate instruction embeddings. Lie et al. [29] use a combination of distance features to find similarities between two functions.…”
Section: Embedding Structural Features (mentioning)
confidence: 99%
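The basic-block comparison described in this statement (instructions as words, blocks as sentences, an LSTM producing block embeddings that are then compared) can be illustrated with a minimal PyTorch sketch. The class name, dimensions, and cosine-similarity comparison below are assumptions for illustration, not the INNEREYE implementation.

import torch.nn as nn
import torch.nn.functional as F

class BlockEncoder(nn.Module):
    """Sketch: embed instruction ids ("words") and summarize a basic block ("sentence") with an LSTM."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # instruction id -> vector
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, instr_ids):
        # instr_ids: (batch, block_len) integer instruction ids
        x = self.embed(instr_ids)
        _, (h_n, _) = self.lstm(x)
        return h_n[-1]                                      # (batch, hidden_dim) block embedding

def block_similarity(encoder, block_a, block_b):
    """Cosine similarity between two basic-block embeddings (higher means more similar)."""
    return F.cosine_similarity(encoder(block_a), encoder(block_b), dim=-1)

In a siamese arrangement, two such encoders (one per architecture, or weight-sharing within a single architecture) would be trained so that semantically similar blocks receive nearby embeddings.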
“…Zuo et al.'s work [10] addressed the task of binary similarity by converting a basic block into an embedding and measuring the distance between two embeddings; the instructions in a basic block are combined through the use of a Long Short-Term Memory (LSTM) network. Redmond et al. [11] proposed a joint learning approach to generating instruction embeddings that capture not only the semantics of instructions within an architecture but also their semantic relationships across architectures. SAFE [12], proposed by Massarelli et al., is a general architecture for calculating binary function embeddings starting from disassembled binaries, using a self-attentive recurrent neural network that parses all instructions according to their addresses.…”
Section: NLP-based Binary Code Similarity Detection (mentioning)
confidence: 99%
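The cross-architecture instruction embedding idea attributed to Redmond et al. [11] (instructions from different architectures mapped into one shared space so that semantically related instructions end up close together) can be sketched with a simple contrastive objective. This is not the joint-learning procedure of the cited work; the vocabulary sizes, the aligned-pair supervision, and the margin loss below are all assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical vocabularies of normalized x86 and ARM instructions, embedded into one shared space.
x86_embed = nn.Embedding(5000, 100)
arm_embed = nn.Embedding(4000, 100)
opt = torch.optim.Adam(list(x86_embed.parameters()) + list(arm_embed.parameters()), lr=1e-3)

def contrastive_step(x86_ids, arm_ids, labels, margin=1.0):
    """One training step: pull embeddings of equivalent x86/ARM instruction pairs together
    and push non-equivalent pairs at least `margin` apart. labels: 1.0 if equivalent, else 0.0."""
    a, b = x86_embed(x86_ids), arm_embed(arm_ids)
    d = F.pairwise_distance(a, b)
    loss = (labels * d.pow(2) + (1 - labels) * F.relu(margin - d).pow(2)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()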
“…Inspired by NLP (natural language processing), Baldoni et al. [21] embed instructions with the word2vec model and optimize the hyperparameters using a siamese structure. Redmond et al. [22] explore binary instruction embedding across architectures. They convert the binary code to an intermediate language and record the input/output as a signature for comparison.…”
Section: Introduction (mentioning)
confidence: 99%
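The word2vec-style instruction embedding mentioned here (each normalized instruction treated as a word, each basic block as a sentence) can be reproduced in a few lines with gensim. The toy corpus and the operand normalization (reg/imm/addr) below are assumptions for illustration.

from gensim.models import Word2Vec

# Assumed toy corpus: each basic block is a "sentence"; each normalized
# instruction (operands abstracted to reg/imm/addr) is a "word".
blocks = [
    ["mov reg, imm", "add reg, reg", "jmp addr"],
    ["mov reg, imm", "cmp reg, reg", "jne addr"],
]

# Skip-gram word2vec over instruction "words" (gensim 4.x API).
model = Word2Vec(sentences=blocks, vector_size=100, window=2,
                 min_count=1, sg=1, epochs=50)

vec = model.wv["mov reg, imm"]                      # embedding of one instruction
neighbours = model.wv.most_similar("mov reg, imm")  # semantically close instructions

A siamese network, as in the statement above, would then compare or fine-tune such embeddings so that similar code fragments map to nearby points.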