Prevalence and Impact of Low-Entropy Packing Schemes in the Malware Ecosystem

Mantovani, Alessandro; Aonzo, Simone; Ugarte-Pedrero, Xabier; Merlo, Alessio; Balzarotti, Davide

doi:10.14722/ndss.2020.24297

Cited by 24 publications

(16 citation statements)

References 32 publications

“…Clearly, such techniques also require a fully supervised training set composed by well tagged benign and malicious samples, so as to build and train the classifiers to categorize samples as malicious or benign. For this reason, we only selected samples that were identified to be malicious by at least 5 AV detection on VirusTotal-a rather conservative solution compared with the threshold used by other works [60]. Moreover, in contrast to many studies that selected benign samples by picking popular Windows applications or installation files, which in general are very well-known files and therefore easy to spot and whitelist by the security companies, we assembled our benign dataset from VirusTotal submissions.…”

Section: Sample Selection (mentioning)

confidence: 99%

Does Every Second Count? Time-based Evolution of Malware Behavior in Sandboxes

Alexander, Mantovani, Han et al. 2021

Proceedings 2021 Network and Distributed System Security Symposium

Self Cite

The amount of time for which a sample is executed is one of the key parameters of a malware analysis sandbox. Setting the threshold too high hinders scalability and reduces the number of samples that can be analyzed in a day; too low and the samples may not have the time to show their malicious behavior, thus reducing the amount and quality of the collected data. Therefore, an analyst needs to find the 'sweet spot' that allows collecting only the minimum amount of information required to properly classify each sample. Anything more wastes resources; anything less jeopardizes the experiments. Despite its importance, there are no clear guidelines on how to choose this parameter, nor experiments that can help companies assess the pros and cons of one choice over another. To fill this gap, in this paper we provide the first large-scale study of the impact that the execution time has on both the amount and the quality of the collected events. We measure the evolution of system calls and code coverage to draw a precise picture of the fraction of runtime behavior we can expect to observe in a sandbox. Finally, we implemented a machine-learning-based malware detection method and applied it to the data collected in different time windows, to also report on the relevance of the events observed at different points in time. Our results show that most samples run for either less than two minutes or for more than ten. However, most of the behavior (and 98% of the executed basic blocks) is observed during the first two minutes of execution, which is also the time window that results in higher accuracy of our ML classifier. We believe this information can help future researchers and industrial sandboxes to better tune their analysis systems.

“…Table 4 shows the result of the error rate of dataset1. We also compared our results with the results of Mantovani et al [3]. Parameter w indicates the vectors of all features, and parameter w' indicates the vectors of all features except the entropy-related features.…”

Section: Results (mentioning)

confidence: 97%

“…We used two datasets in our experiments: one was acquired from [3], and the other was collected from VirusTotal; these are termed as dataset1 and dataset2, respectively. The samples collected from VirusTotal are all ransomware, and the collecting period is between May 29 and June 30, 2020.…”

Section: Dataset (mentioning)

confidence: 99%

“…For dataset1, we adopted the labels used by Mantovani et al [3]. The training set of dataset1 comprises 7,500 packed and unpacked samples each.…”

Section: Data Labelling (mentioning)

confidence: 99%

“…Table 2 shows the best detection result of the error rate of the machine learning model with entropy-related features in the research of Mantovani et al [3]. They attempted to determine the effect of low-entropy packed samples on machine learning.…”

Section: Introduction (mentioning)

confidence: 99%

An Experience in Enhancing Machine Learning Classifier Against Low-Entropy Packed Malwares

Chen, Chuang, Tien et al. 2021

Computer Science & Information Technology (CS & IT)

Both benign applications and malware use packing, for different purposes, to conceal the real part of the program. According to recent research reports, existing machine learning (ML) based malware detection engines have difficulty classifying packed malware effectively, especially when it is low-entropy packed. We recently counted samples and found that the ratio of low-entropy packed ransomware is extremely high, which causes a high error rate in currently used ML approaches. Thus, we propose a new method to extract entropy-related features and use a stack model to build an ML malware engine that effectively detects low-entropy packed malware. We evaluate our method using over 15,000 malware samples collected from VirusTotal and compare the results with related research. This experience report shows that our adopted model and features can significantly lower the error rate of low-entropy packed detection from 11% to 1%.
