Recently, some graph-based methods have been proposed for malware detection. However, current malware is generally characterized by sophisticated behaviors, which makes graph-based malware detection extremely challenging. To address this issue, we propose a graph repartition algorithm by transforming API call graphs into fragment behaviors based on programs’ dynamic execution traces. The proposed algorithm relies on the N-order subgraph (NSG) for constructing the appropriate fragment behavior. Moreover, we improve the term frequency-inverse document frequency- (TF-IDF-) like measure and information gain (IG) to extract the crucial N-order subgraph (CNSG). This novel behavioral representation and improved extraction method can accurately represent crucial behaviors of malware. Experiments on 4,400 samples demonstrate that the proposed method achieves a high accuracy of 99.75% in malware detection and promising performance of 95.27% in malware classification.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.