With the growing use of smartphones for banking, medical services, and m-commerce, and with users downloading applications from unofficial sources, security has become a major concern for smartphone users. Malicious apps can steal passwords, leak personal details, and otherwise compromise users' accounts. Current anti-virus programs rely on static signatures that must be updated periodically and cannot identify zero-day malware. The Android permission system is the central security mechanism that regulates the execution of application tasks. Although recent research has produced various approaches and detection methods for finding malware apps, the available literature lacks a full analysis of this subject. We fill this gap in two ways. 1) We systematically and automatically build a large dataset of malware and benign apps, which we have made available to the community. Our dataset has around 16K apps and 118 features. 2) We offer a novel approach for automatically identifying permission usage patterns, which are groupings of permissions that developers frequently use together. The approach combines self-organizing map (SOM) and K-means clustering algorithms to classify permissions according to app usage categories. The results demonstrate that the proposed methodology detects most of the consistent and coherent permission usage patterns across a wide variety of application categories. To assess our strategy, we add the identified patterns as features to our dataset and then apply a support vector machine (SVM) classifier for malware detection. Our results indicate that the identified patterns improve the performance of the classifier.
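The idea of permission usage patterns can be illustrated with a minimal sketch. It is not the authors' pipeline: it leaves out the SOM stage and uses plain K-means only, with a deterministic farthest-first initialisation chosen for reproducibility. The app matrix, the permission subset, and the helper names (`usage_patterns`, `pattern_features`) are hypothetical and chosen for illustration. Each permission is represented by the column of apps that request it, so permissions requested by the same apps fall into the same cluster.

```python
# Hypothetical toy data: rows are apps, columns are permissions (1 = requested).
PERMISSIONS = ["INTERNET", "ACCESS_NETWORK_STATE", "SEND_SMS",
               "READ_SMS", "CAMERA", "RECORD_AUDIO"]
APPS = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return [list(c) for c in centroids]


def kmeans(points, k, iters=20):
    """Plain K-means; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels


def usage_patterns(apps, names, k):
    """Cluster permissions by which apps request them."""
    columns = [list(col) for col in zip(*apps)]  # one vector per permission
    labels = kmeans(columns, k)
    patterns = [[] for _ in range(k)]
    for name, lab in zip(names, labels):
        patterns[lab].append(name)
    return patterns


def pattern_features(app_row, names, patterns):
    """Per pattern, the fraction of its permissions that the app requests."""
    requested = {n for n, v in zip(names, app_row) if v}
    return [len(requested & set(p)) / len(p) if p else 0.0 for p in patterns]


patterns = usage_patterns(APPS, PERMISSIONS, 3)
features = [pattern_features(row, PERMISSIONS, patterns) for row in APPS]
```

The `features` rows could then be appended to each app's existing feature vector before training a classifier, which mirrors the evaluation step described in the abstract.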
As the leading mobile operating system, Android is an attractive target for malicious applications that try to exploit the system’s security vulnerabilities. Although several approaches have been proposed in the research literature for detecting Android malware, many of them suffer from issues that ultimately affect their performance, such as small training datasets and few features (most studies are limited to permissions). To address these issues, we propose an approach that combines advanced machine learning techniques with Android vulnerabilities taken from the AndroVul dataset, which contains a novel combination of features at three different vulnerability levels: dangerous permissions, code smells, and AndroBugs vulnerabilities. Our approach relies on that dataset to train Deep Learning (DL) and Support Vector Machine (SVM) models for the detection of Android malware. Our results show that both models are capable of detecting malware encoded in Android APK files with about 99% accuracy, which is better than the current state-of-the-art approaches.
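To make the SVM side of this approach concrete, the following is a minimal sketch of a linear soft-margin SVM trained by subgradient descent on the hinge loss. It is an illustration under stated assumptions, not the models or settings used in the paper: the three feature columns stand in for counts at the three vulnerability levels, the rows are invented and are not AndroVul data, and the function names and hyperparameters (`lr`, `lam`, `epochs`) are hypothetical.

```python
# Hypothetical toy rows: [dangerous permissions, code smells, AndroBugs findings].
# The numbers are invented for illustration and are not AndroVul data.
X = [[1, 0, 0], [2, 1, 0], [0, 1, 1], [6, 4, 3], [7, 5, 2], [5, 3, 4]]
Y = [-1, -1, -1, 1, 1, 1]  # -1 = benign, +1 = malware


def score(w, b, x):
    """Signed distance-like score of sample x under weights w and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(xs, ys, lr=0.01, lam=0.001, epochs=1000):
    """Subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * score(w, b, x) < 1  # inside the margin or misclassified
            # The L2 penalty shrinks the weights on every step (bias is not penalised).
            w = [wi * (1 - lr * lam) for wi in w]
            if violated:
                # Hinge-loss subgradient step pushes the sample toward its side.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (malware) or -1 (benign)."""
    return 1 if score(w, b, x) >= 0 else -1


w, b = train_linear_svm(X, Y)
predictions = [predict(w, b, x) for x in X]
```

A practical setup would more likely use an established library implementation, possibly with a non-linear kernel, and would report accuracy on held-out data, not on the training rows used here.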