2011 Sixth International Conference on Availability, Reliability and Security 2011
DOI: 10.1109/ares.2011.35

Accurate Adware Detection Using Opcode Sequence Extraction

Cited by 23 publications (8 citation statements)
References 13 publications
“…In Table 1, we show statistics on the number of unique n-gram opcodes obtained from our dataset, which has a total of 2000 benign and malware samples. Our findings attest to other research [7,8,25] that the number of unique n-grams increases proportionally to the size of n. Since machine learning classifiers only understand features in numerical representations, we vectorize each sample's n-gram opcode sequences using the term frequency-inverse document frequency (TF-IDF) [26,27]. TF-IDF works by creating a dictionary of unique n-gram opcode sequences and then measures the frequency of occurrence of each unique n-gram opcode within a given sample using the term frequency (TF) and with inverse document frequency (IDF), measures the importance of the unique n-gram opcode on the basis of frequency of occurrence across the entire corpus.…”
Section: Fig 2 Example Of N-gram Opcode Sequences Generation (supporting)
confidence: 92%
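
The TF-IDF vectorization described in the statement above can be illustrated with a minimal sketch. It is not the citing authors' implementation: the function names `opcode_ngrams` and `tfidf_vectors`, the toy opcode sequences, and the specific weighting (TF as relative frequency within a sample, IDF as log(N / df) without smoothing) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Return all contiguous n-grams (as tuples) from a list of opcodes."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def tfidf_vectors(samples, n=2):
    """Vectorize opcode sequences with TF-IDF over their n-grams.

    TF  = count of the n-gram in the sample / number of n-grams in the sample
    IDF = log(N / df), where df is the number of samples containing the n-gram
    (an unsmoothed variant, assumed here for simplicity).
    """
    grams_per_sample = [opcode_ngrams(s, n) for s in samples]
    # Dictionary of unique n-grams across the whole corpus.
    vocab = sorted({g for grams in grams_per_sample for g in grams})
    # Document frequency: in how many samples each n-gram appears.
    doc_freq = Counter(g for grams in grams_per_sample for g in set(grams))
    n_samples = len(samples)

    vectors = []
    for grams in grams_per_sample:
        counts = Counter(grams)
        total = len(grams) or 1  # avoid division by zero for short samples
        vectors.append([
            (counts[g] / total) * math.log(n_samples / doc_freq[g])
            for g in vocab
        ])
    return vocab, vectors


# Hypothetical toy opcode sequences standing in for disassembled samples.
samples = [
    ["push", "mov", "call", "pop"],
    ["push", "mov", "add", "ret"],
]
vocab, vectors = tfidf_vectors(samples, n=2)
```

In this toy corpus the bigram `("push", "mov")` occurs in both samples, so its IDF is log(2/2) = 0 and it carries no weight, while bigrams unique to one sample receive a positive weight. Library implementations typically apply smoothing and normalization, so their values would differ from this sketch.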
“…Besides CNN, RNN has also been used for malware analysis. [28] and [29] proposed techniques with LSTM using opcode sequences of malware. Santos et al [30] proposed a hybrid technique by integrating both static and dynamic analysis.…”
Section: Related Work (mentioning)
confidence: 99%
“…As the model is trained on disassembled virus executables, the quality of the disassembler may affect the results [22]. Other than that, since it just retrieves part of the program, it may miss some important information of malicious code [23]. But the computation time is faster as the data size is smaller.…”
Section: Related Work (mentioning)
confidence: 99%