2020
DOI: 10.25046/aj050566

Malware classification using XGboost-Gradient Boosted Decision Tree

Abstract: In this Industry 4.0 and digital era, we depend increasingly on communication and on transactions of various kinds, such as financial transactions and the exchange of information by various means. These transactions need to be secure. Differentiating between benign software and malware is one way to make these transactions secure. We propose in this work a malware classification scheme that constructs a model using low-end computing resources and a very large balanced dataset for malware. To our knowledge, and search the complete d…

Cited by 31 publications (8 citation statements)
References 33 publications
“…Different types of analysis based approaches have been suggested in the literature for determining malware categories [10]. Kumar et al [21] use the XGboost model for malware detection. It uses the Ember dataset in which there are 300k malicious and 300k non-malicious instances.…”
Section: Literature Review (citation type: mentioning)
confidence: 99%
“…Kumar and Geetha [23] proposed a malware classification scheme that constructs a model using low-end computing resources and a very large balanced dataset (the EMBER dataset, which consists of 1.1 million entries) for malware. The authors compared the performance of nine algorithms: Gaussian NB, KNN, linear support vector classification (SVC), DT, AdaBoost, RF, extra trees, gradient boost (GDB), and XGBoost.…”
Section: Reference Work (citation type: mentioning)
confidence: 99%
“…In static analysis, the features of malware may be extracted from the PE header [12] or the Application Program Interface (API) calls from the loaded dynamic link library (DLL) [13]. Features for static analysis can also be extracted from software files, such as histogram of bytes in the sample, the entropy of parts of the sample file, and printable strings with more than five characters embedded in the sample file [14]. Raff et al. [15] used n-grams from byte code for static analysis.…”
Section: Literature Survey (citation type: mentioning)
confidence: 99%
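
The static features named in the last citation statement (byte histogram, entropy, and printable strings longer than five characters) can be illustrated with a minimal Python sketch. This is not the feature extractor used by the cited papers or by the EMBER dataset; the function names, the whole-buffer entropy (rather than entropy over parts of the file), and the printable-ASCII range are assumptions made for illustration only.

```python
import math
import re
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[int]:
    """Count how often each of the 256 possible byte values occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the buffer in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def printable_strings(data: bytes, min_len: int = 6) -> List[bytes]:
    """Runs of printable ASCII of at least min_len bytes.

    min_len=6 is an assumed reading of "more than five characters".
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


# Hypothetical sample bytes, not a real executable.
sample = b"MZ\x00\x00hello world\x00abc\x00"
features = {
    "histogram": byte_histogram(sample),
    "entropy": shannon_entropy(sample),
    "strings": printable_strings(sample),
}
```

In a classification setting, vectors like these would be concatenated per sample and passed to a model such as a gradient boosted tree; that step is omitted here.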