Abstract. We propose a novel two step dimensionality reduction approach based on correlation using machine learning techniques for identifying unseen malicious Executable Linkable Files (ELF). System calls used as features are dynamically extracted in a sandbox environment. The extended version of symmetric uncertainty (X-SU) proposed by us, ranks feature by determining Feature-Class correlation using entropy, information gain and further eliminate the redundant features by estimating Feature-Feature correlation using weighted probabilistic information gain. Three learning algorithms (J48, Adaboost and Random Forest) are employed to generate prediction models, from the system call traces. Optimal feature vector constructed using minimum feature length (27 no.) resulted in over all classification accuracy of 99.40% with very less false alarm to identify unknown malicious specimens.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.