2013 IEEE International Conference on Acoustics, Speech and Signal Processing
DOI: 10.1109/icassp.2013.6638293

Large-scale malware classification using random projections and neural networks

Abstract: Automatically generated malware is a significant problem for computer users. Analysts are able to manually investigate a small number of unknown files, but the best large-scale defense for detecting malware is automated malware classification. Malware classifiers often use sparse binary features, and the number of potential features can be on the order of tens or hundreds of millions. Feature selection reduces the number of features to a manageable number for training simpler algorithms such as logistic regres…
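
The random-projection idea named in the title, applied to the sparse binary features the abstract describes, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `make_projection` and `project`, the `density` value, and the toy dimensions are all hypothetical, and the matrix uses a generic sparse sign scheme (entries of +s, -s, or 0) rather than whatever construction the authors actually used.

```python
import math
import random


def make_projection(n_features, n_components, density=1 / 3, seed=0):
    """Build a sparse random projection matrix, one dict per input feature.

    Assumption (not from the source): each entry is +s or -s with
    probability density/2 each and 0 otherwise, where
    s = sqrt(1 / (density * n_components)).
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 / (density * n_components))
    rows = []
    for _ in range(n_features):
        row = {}  # only nonzero entries are stored: {output_index: weight}
        for j in range(n_components):
            u = rng.random()
            if u < density / 2:
                row[j] = scale
            elif u < density:
                row[j] = -scale
        rows.append(row)
    return rows


def project(active_features, projection, n_components):
    """Project a sparse binary vector, given as the indices of its 1-bits."""
    out = [0.0] * n_components
    for i in active_features:
        # A binary feature contributes its whole matrix row to the output.
        for j, weight in projection[i].items():
            out[j] += weight
    return out


# Toy usage: 1000 hypothetical binary features reduced to 16 dimensions.
projection = make_projection(n_features=1000, n_components=16, seed=0)
sample = [3, 42, 977]  # indices of the features present in one file
dense = project(sample, projection, 16)
print(len(dense))
```

Because the input is binary and sparse, the projection costs only as much as the number of active features times the number of nonzero entries per row, which is why the technique can plausibly scale to very large feature spaces. The low-dimensional output could then be fed to a classifier such as a neural network.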

Cited by 341 publications (175 citation statements)
References: 12 publications
“…In our experiments the logistic regression classifier outperformed Naive Bayes, SVM and Decision Trees implementations from [16], verifying a high performance previously observed in [6]. The schema for our approach is shown in Figure 2.…”
Section: Building An Ensemble; citation type: supporting
confidence: 82%
“…We consider [6], where random projections were used to reduce the feature space (sparse binary features, API trigrams and API calls) to classify Windows malware on a dataset of several million samples, to be the highest-impact contribution to the dimensionality reduction problem in malware classification. Although their work is not directly dealing with Android malware, we consider this publication to be very relevant due to its tackling a similar large-scale classification problem.…”
Section: Related Literature Review; citation type: mentioning
confidence: 99%
“…Thus, there is a need to analyze every new malware sample to see if it comes from an already known malware family or represents a new breed of malware. This malware classification problem belongs to the data mining domain, and hence considerable research efforts have been made to apply machine learning techniques such as classification [3], [4], clustering [5], [6], Artificial Neural Networks (ANNs) [7], Hidden Markov Models (HMMs) [8], [9], etc. to solve this problem.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…To the best of our knowledge, the only attempt at training DNNs on randomly projected data, and therefore the approach that is most relevant to our fixed-weight RP layers, was presented in [46]. Therein, Dahl et al. used randomly projected data as input to networks trained for the malware classification task.…”
Section: Related Work; citation type: mentioning
confidence: 99%