A Novel Malware Classification Method Based on Crucial Behavior

Xiao, Fei; Sun, Yi; Du, Donggao; Li, Xuelei; Luo, Min

doi:10.1155/2020/6804290

Cited by 11 publications

(8 citation statements)

References 36 publications


“…Moreover, to mitigate the shortcomings of the above-mentioned feature representation techniques, several studies such as Belaoued et al (2019) and Ali et al (2020) selected the most important features based on the weights that were calculated using the traditional TF-IDF technique, while ( Li et al, 2020a ) used TF-IDF technique to select and represent the proposed feature set. Other studies ( Xue et al, 2019 ; Xiao et al, 2020 ; Al-Rimy et al, 2020 ; Qin, Zhang & Chen, 2021 ) developed the traditional TF-IDF to propose enhanced TF-IDF techniques by which the obtained features were represented using more accurate weights. Xue et al (2019) proposed a malware classification model that connected a convolutional neural network (CNN) trained on static features and the random forest (RF) trained on dynamic features via a probability scoring threshold.…”

Section: Related Work (mentioning)

confidence: 99%

“…A well-known TF-IDF technique is imported from the information retrieval field and used for representation purposes by several malware detection researchers ( Zhang et al, 2019 ; Ali et al, 2020 ; Li et al, 2020a ; Li et al, 2020b ) to represent the extracted features in the form of weight-based vectors. Furthermore, several studies ( Wang & Zhang, 2013 ; Xue et al, 2019 ; Xiao et al, 2020 ; Al-Rimy et al, 2020 ; Qin, Zhang & Chen, 2021 ) have been carried out to develop various feature representation techniques by enhancing the concept of the traditional TF-IDF technique and boost its capability to accurately represent the extracted feature. However, the primary principle of these techniques has been built based on the main concept of the traditional TF-IDF technique, by which the probability distributions of the features in each class are not considered when the IDF is calculated.…”

Section: Introduction (mentioning)

confidence: 99%
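The limitation described in the statement above, that the IDF term ignores how a feature is distributed across classes, can be illustrated with a minimal sketch of plain TF-IDF. The API-call traces and their labels below are hypothetical and are not taken from any of the cited papers; `tf_idf` is an illustrative helper, not a library function.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical API-call traces. The class labels appear only as comments,
# because tf_idf() never looks at them.
traces = [
    ["CreateFile", "WriteFile", "RegSetValue"],   # malware
    ["CreateFile", "RegSetValue", "Connect"],     # malware
    ["CreateFile", "ReadFile", "CloseHandle"],    # benign
    ["CreateFile", "ReadFile", "Connect"],        # benign
]
weights = tf_idf(traces)
```

In this toy corpus, `RegSetValue` occurs only in the malware traces while `Connect` occurs once in each class, yet both appear in two of four documents and therefore receive the same weight in the second trace.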


A Kullback-Liebler divergence-based representation algorithm for malware detection

Aboaoja, Zainal, Ghaleb et al. 2023

PeerJ Computer Science


Background: Malware (malicious software) is a major security concern of the digital realm. Conventional cyber-security solutions are challenged by sophisticated malicious behaviors. Currently, an overlap between malicious and legitimate behaviors makes it more difficult to characterize those behaviors as malicious or legitimate. For instance, evasive malware often mimics legitimate behaviors, and evasion techniques are used by both legitimate and malicious software.

Problem: Most existing solutions use the traditional term frequency-inverse document frequency (TF-IDF) technique or its concept to represent malware behaviors. However, the traditional TF-IDF and the techniques developed from it represent features, especially shared ones, inaccurately, because they calculate a weight for each feature without considering its distribution in each class; instead, the weight is generated from the distribution of the feature among all documents. This presumption can reduce the meaning of those features, and when they are used to classify malware, they lead to a high false alarm rate.

Method: This study proposes a Kullback-Liebler Divergence-based Term Frequency-Probability Class Distribution (KLD-based TF-PCD) algorithm to represent the extracted features based on the differences between the probability distributions of the terms in the malware and benign classes. Unlike existing solutions, the proposed algorithm increases the weights of the important features by using the Kullback-Liebler divergence to measure the differences between their probability distributions in the malware and benign classes.

Results: The experimental results show that the proposed KLD-based TF-PCD algorithm achieved an accuracy of 0.972, a false positive rate of 0.037, and an F-measure of 0.978. Such results were significant compared to the related work studies. Thus, the proposed KLD-based TF-PCD algorithm contributes to improving the security of cyberspace.

Conclusion: New meaningful characteristics have been added by the proposed algorithm to promote the learned knowledge of the classifiers, and thus increase their ability to classify malicious behaviors accurately.
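The abstract does not give the TF-PCD formula, so the following is only a rough sketch of the general idea of scoring a term by the divergence between its class-conditional distributions. It assumes each term is modeled as a Bernoulli variable (present or absent in a document) with add-one smoothing; the function names, the smoothing choice, and the toy traces are all assumptions made for illustration, not the authors' method.

```python
import math

def class_probability(term, docs, smoothing=1.0):
    """Smoothed fraction of documents in one class that contain the term."""
    hits = sum(1 for doc in docs if term in doc)
    return (hits + smoothing) / (len(docs) + 2 * smoothing)

def bernoulli_kl(p, q):
    """KL divergence D(P || Q) between two Bernoulli distributions."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def class_divergence(term, malware_docs, benign_docs):
    """Score a term by how differently it is distributed in the two classes."""
    p = class_probability(term, malware_docs)
    q = class_probability(term, benign_docs)
    return bernoulli_kl(p, q)

# Hypothetical API-call traces, split by class.
malware = [
    ["CreateFile", "WriteFile", "RegSetValue"],
    ["CreateFile", "RegSetValue", "Connect"],
]
benign = [
    ["CreateFile", "ReadFile", "CloseHandle"],
    ["CreateFile", "ReadFile", "Connect"],
]

shared_score = class_divergence("Connect", malware, benign)
exclusive_score = class_divergence("RegSetValue", malware, benign)
```

Under these assumptions a term that is equally common in both classes (`Connect`) scores zero, while a term seen only in the malware traces (`RegSetValue`) scores higher. A weighting scheme in the spirit of the abstract could multiply such a score by the term frequency, but the exact combination used in the paper is not stated here.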


“…It is challenging to address all types of threats with the same strategy because each type needs its defense strategy like antivirus, firewalls, algorithms, etc. ( Xiao et al, 2020 ). It is a severe problem for e-commerce ( Kim et al, 2018 ).…”

Section: Literature Review (mentioning)

confidence: 99%

“…The number of attacks through malware is a serious threat to e-commerce as the number of attacks is increasing yearly by a significant proportion. There were 670,000,000 malware variants in 2017, almost double the number in 2016 ( Xiao et al, 2020 ).…”

Section: Literature Review (mentioning)

confidence: 99%

Cyber security threats: A never-ending challenge for e-commerce

Liu, Ahmad, Anser et al. 2022

Front. Psychol.


This study explores the challenge of cyber security threats that e-commerce technology and business are facing. Technology applications for e-commerce are attracting attention from both academia and industry. It has made what was not possible before for the business community and consumers. But it did not come all alone but has brought some challenges, and cyber security challenge is one of them. Cyber security concerns have many forms, but this study focuses on social engineering, denial of services, malware, and attacks on personal data. Firms worldwide spend a lot on addressing cybersecurity issues, which grow each year. However, it seems complicated to overcome the challenge because the attackers continuously search for new vulnerabilities in humans, organizations, and technology. This paper is based on the conceptual analysis of social engineering, denial of services, malware, and attacks on personal data. We argue that implementing modern technology for e-commerce and cybersecurity issues is a never-ending game of cat and mouse. To reduce risks, reliable technology is needed, training of employees and consumer is necessary for using the technology, and a strong policy and regulation is needed at the firm and governmental level.


“…In [8], the confused malware is detected by proper hook installation and real calculation of malware activity time in user and kernel. In [9], a graph repartitioning algorithm that uses the N-order subgraph (NSG) to convert API call graphs into fragment behaviors is proposed for malware detection and family classification. Besides, the "term frequency-inverse document frequency" (TF-IDF) and information gain (IG) were improved and used to extract the crucial N-order subgraph (CNSG).…”

Section: Introduction (mentioning)

confidence: 99%

A Novel Malware Detection and Family Classification Scheme for IoT Based on DEAM and DenseNet

Wang, Zhao, Wang et al. 2021

Security and Communication Networks


With the rapid increase in the amount and type of malware, traditional methods of malware detection and family classification for IoT applications through static and dynamic analysis have been greatly challenged. In this paper, a new simple and effective attention module of Convolutional Neural Networks (CNNs), named as Depthwise Efficient Attention Module (DEAM), is proposed and combined with a DenseNet to propose a new malware detection and family classification model. Based on the good effect of the DenseNet in the field of image classification and the visual similarity of the malware family on images, the gray-scale image transformed from malware is input into the model combined with the DEAM and DenseNet for malware detection, and then the family classification is carried out. The DEAM is a general lightweight attention module improved based on the Convolutional Block Attention Module (CBAM), which can strengthen the attention to the characteristics of malware and improve the model effect. We use the MalImg dataset, Microsoft malware classification challenge dataset (BIG 2015), and our dataset constructed by the two above-mentioned datasets to verify the effectiveness of the proposed model in family classification and malware detection. Experimental results show that the proposed model achieves 99.3% in terms of accuracy for malware detection on our dataset and achieves 98.5% and 97.3% in terms of accuracy for family classification on the MalImg dataset and BIG 2015 dataset, respectively. The model can reliably detect IoT malware and classify its families.

