2015
DOI: 10.11113/jt.v77.3558

Feature Selection and Machine Learning Classification for Malware Detection

Abstract: Malware is a computer security problem that can morph to evade traditional detection methods based on known signature matching. Since new malware variants contain patterns that are similar to those in observed malware, machine learning techniques can be used to identify new malware. This work presents a comparative study of several feature selection methods with four different machine learning classifiers in the context of static malware detection based on n-grams analysis. The result shows that the use of Pri…

Citations: cited by 36 publications (18 citation statements)
References: 29 publications
“…The proposed method extracts n-gram features from the content of the file, and then filters the huge number of n-gram features. Snort sub-signature is used as a first stage filtering process and only the features that exist in Snort sub-signature which are different from the previous work [22] are selected. A second filter stage, the feature selection method has been used.…”
Section: Proposed Malware Detection Methods (mentioning)
confidence: 99%
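
The two-stage idea in the statement above (extract n-grams from file content, then keep only those found in a reference signature set) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function names, the n-gram length, and the tiny in-memory `signature_grams` set standing in for Snort sub-signatures are all assumptions made for the example.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte window in `data`.

    Returns an empty Counter when the input is shorter than n bytes.
    """
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def filter_by_signatures(counts: Counter, signature_grams: set) -> dict:
    """First-stage filter: keep only n-grams that occur in a reference set.

    `signature_grams` is a hypothetical stand-in for sub-signatures
    taken from an external rule set.
    """
    return {gram: c for gram, c in counts.items() if gram in signature_grams}


# Toy input: eight bytes, giving seven overlapping 2-grams.
sample = bytes.fromhex("4d5a90004d5a9000")
counts = byte_ngrams(sample, n=2)
kept = filter_by_signatures(counts, {b"\x4d\x5a", b"\x90\x00"})
```

The surviving n-gram counts would then be passed to a second-stage feature selection method, which this sketch does not cover.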
“…This process is repeated k times. Finally, the average of k results is calculated to determine classifier performance [44]. In this study, k was selected as 10.…”
Section: Validation Of Classifiers (mentioning)
confidence: 99%
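
The k-fold procedure described in the statement above (train on k-1 folds, test on the held-out fold, repeat k times, average the k results) can be sketched in plain Python. The helper names, the toy threshold classifier, and the toy data are assumptions introduced only for illustration; the citing study's actual classifiers and tooling are not shown in the excerpt.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once.

    Assumes n_samples >= k so that no fold is empty.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train_idx, test_idx


def cross_validate(X, y, fit, predict, k=10):
    """Return the mean accuracy over k train/test rounds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical one-feature classifier: threshold at the midpoint of class means.
def fit(X, y):
    zeros = [x for x, label in zip(X, y) if label == 0]
    ones = [x for x, label in zip(X, y) if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict(threshold, x):
    return int(x > threshold)


# Toy, well-separated data: ten samples per class.
X = list(range(10)) + list(range(20, 30))
y = [0] * 10 + [1] * 10
accuracy = cross_validate(X, y, fit, predict, k=10)
```

With k = 10 and 20 samples, each round holds out two samples and trains on the remaining 18.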
“…For this reason, the information gain of each feature is experimented. In order to experiment the information gain each used feature provides, CorrelationAttributeEval attribute evaluator, which evaluates the worth of an attribute by measuring the correlation between it and the class [12,20,42,43,56], is used with the Ranker search method. As the experimental results listed in Table 3 show the novel feature number of lines of code provides the best information gain.…”
Section: Feature Selection (mentioning)
confidence: 99%
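
Ranking attributes by their correlation with the class, as the statement above describes, can be approximated in a few lines. This sketch assumes Pearson correlation and a descending sort by absolute value; it is not Weka's code, and the feature names other than the lines-of-code feature mentioned in the statement, as well as all values, are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_features(columns: dict, labels):
    """Sort features by |correlation with the class label|, best first."""
    scores = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: four samples with binary class labels (values are invented).
labels = [0, 0, 1, 1]
columns = {
    "constant_flag": [1, 1, 1, 1],      # hypothetical, carries no signal
    "lines_of_code": [10, 12, 40, 45],  # tracks the class closely
    "noise": [3, 1, 2, 2],              # hypothetical, uncorrelated here
}
ranked = rank_features(columns, labels)
```

Here `ranked[0]` is the `lines_of_code` entry, since it is the only column that varies with the label.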