<div>The Android Operating System (OS) everywhere, computers, cars, homes, and, of course, personal and corporate smartphones. A recent survey from the International Data Corporation (IDC) reveals that the Android platform holds 85% of the smartphone market share. Its popularity and open nature make it an attractive target for malware. According to AV-TEST, by November 2020, 2.87M new Android malware instances were identified in the wild. Malware detection is a challenging problem that has been actively explored by both the industry and academia using intelligent methods. On the one hand, traditional machine learning (ML) malware detection methods rely on manual feature engineering that requires expert knowledge. On the other hand, deep learning (DL) malware detection methods perform automatic feature extraction but usually require much more data and processing power. In this work, we propose a new multimodal DL Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executable (DEX) grayscale images), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid, which contains static and dynamic analysis data extracted from 22,000 real malware and goodware samples. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, Recall, and ROC AUC outperform classical ML algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.</div>
In this work, we propose a new multimodal Deep Learning (DL) Android malware detection method, Chimera, that combines both manual and automatic feature engineering by using the DL architectures, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), and Transformer Networks (TN) to perform feature learning from raw data (Dalvik Executables (DEX)), static analysis data (Android Intents & Permissions), and dynamic analysis data (system call sequences) respectively. To train and evaluate our model, we implemented the Knowledge Discovery in Databases (KDD) process and used the publicly available Android benchmark dataset Omnidroid. By leveraging a hybrid source of information to learn high-level feature representations for both the static and dynamic properties of Android applications, Chimera’s detection Accuracy, Precision, and Recall outperform classical Machine Learning (ML) algorithms, state-of-the-art Ensemble, and Voting Ensembles ML methods, as well as unimodal DL methods using CNNs, DNNs, TNs, and Long-Short Term Memory Networks (LSTM). To the best of our knowledge, this is the first work that successfully applies multimodal DL to combine those three different modalities of data using DNNs, CNNs, and TNs to learn a shared representation that can be used in Android malware detection tasks.
<div>Malware behavioral graphs provide a rich source of information that can be leveraged for detection and classification tasks. In this paper, we propose a novel behavioral malware detection method based on Deep Graph Convolutional Neural Networks (DGCNNs) to learn directly from API call sequences and their associated behavioral graphs. In order to train and evaluate the models, we created a new public domain dataset of more than 40,000 API call sequences resulting from the execution of malware and goodware instances in a sandboxed environment. Experimental results show that our models achieve similar Area Under the ROC Curve (AUC-ROC) and F1-Score to Long-Short Term Memory (LSTM) networks, widely used as the base architecture for behavioral malware detection methods, thus indicating that the models can effectively learn to distinguish between malicious and benign temporal patterns through convolution operations on graphs. To the best of our knowledge, this is the first paper that investigates the applicability of DGCNN to behavioral malware detection using API call sequences.</div>
<div>Malware behavioral graphs provide a rich source of information that can be leveraged for detection and classification tasks. In this paper, we propose a novel behavioral malware detection method based on Deep Graph Convolutional Neural Networks (DGCNNs) to learn directly from API call sequences and their associated behavioral graphs. In order to train and evaluate the models, we created a new public domain dataset of more than 40,000 API call sequences resulting from the execution of malware and goodware instances in a sandboxed environment. Experimental results show that our models achieve similar Area Under the ROC Curve (AUC-ROC) and F1-Score to Long-Short Term Memory (LSTM) networks, widely used as the base architecture for behavioral malware detection methods, thus indicating that the models can effectively learn to distinguish between malicious and benign temporal patterns through convolution operations on graphs. To the best of our knowledge, this is the first paper that investigates the applicability of DGCNN to behavioral malware detection using API call sequences.</div>
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.