Generic computer virus detection is urgently needed because most commercial antivirus software fails to detect new and unknown viruses. Motivated by the success of data mining and machine learning techniques in intrusion detection systems, recent research on detecting malicious executables is directed towards devising efficient non-signature-based techniques that can profile program characteristics from a set of training examples. Byte sequences and byte n-grams are considered to be the basis of feature extraction. Because the number of n-grams is very large, several methods of feature selection have been proposed in the literature. A recent report on the use of information-gain-based feature selection yielded the best-known result in classifying malicious executables from benign ones. We observe that information gain models the presence of an n-gram in one class and its absence in the other. Through a simple example we show that this may lead to erroneous results. In this paper, we describe a new feature selection measure, the class-wise document frequency of byte n-grams. We empirically demonstrate that the proposed measure is a better method for feature selection. For detection, instead of relying on any single classifier, we combine several classifiers using Dempster-Shafer theory for better classification accuracy. Our experimental results show that such a scheme detects virus programs far more efficiently than the earlier known methods.
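As a rough illustration of the feature selection measure named above, the following Python sketch counts, for each class, the number of files in which a byte n-gram occurs and keeps the most frequent n-grams of every class. The abstract does not give the exact definition or selection rule, so the function names (`byte_ngrams`, `classwise_document_frequency`, `select_features`), the top-k rule, and the toy byte strings are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Dict, List, Set


def byte_ngrams(data: bytes, n: int) -> Set[bytes]:
    """Return the distinct byte n-grams that occur in one file."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def classwise_document_frequency(
    files_by_class: Dict[str, List[bytes]], n: int
) -> Dict[str, Counter]:
    """For each class, count in how many files each n-gram appears."""
    df: Dict[str, Counter] = {}
    for label, files in files_by_class.items():
        counts: Counter = Counter()
        for data in files:
            # A set is used so each file contributes at most 1 per n-gram.
            counts.update(byte_ngrams(data, n))
        df[label] = counts
    return df


def select_features(df: Dict[str, Counter], k: int) -> Set[bytes]:
    """Assumed rule: keep the k most document-frequent n-grams of each class."""
    selected: Set[bytes] = set()
    for counts in df.values():
        # Ties are broken by the n-gram bytes so the result is deterministic.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        selected.update(gram for gram, _ in ranked[:k])
    return selected


# Toy data (hypothetical): two tiny "files" per class.
corpus = {
    "malicious": [b"\x90\x90\xeb", b"\x90\x90\xcc"],
    "benign": [b"\x00\x00\x01", b"\x00\x00\x02"],
}
df = classwise_document_frequency(corpus, n=2)
features = select_features(df, k=1)
```

Ranking within each class separately means an n-gram can be selected because it is common in one class, regardless of whether it is absent from the other, which is the contrast with information gain that the abstract draws.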
The target coverage problem in wireless sensor networks is concerned with maximizing the lifetime of the network while continuously monitoring a set of targets. A sensor covers the targets that are within its sensing range. For a set of sensors and a set of targets, the sensor-target coverage relationship is assumed to be known. A sensor cover is a set of sensors that covers all the targets. The target coverage problem is to determine a set of sensor covers with maximum aggregated lifetime while constraining the life of each sensor by its initial battery life. The problem has been proved to be NP-complete, and heuristic algorithms to solve it have been proposed. In the present study, we give a unified interpretation of earlier algorithms and propose a new and efficient algorithm. We show that all known algorithms are based on a common reasoning, though they seem to be derived from different algorithmic paradigms. We also show that although some algorithms guarantee a bound on the quality of the solution, this bound is neither meaningful nor practical. Our interpretation provides better insight into the solution techniques. We propose a new greedy heuristic that prioritizes sensors by residual battery life. We show empirically that the proposed algorithm outperforms all other heuristics in terms of solution quality. Our experimental study over a large set of randomly generated problem instances also reveals that a very naïve greedy approach yields solutions that are reasonably close (within approximately 10%) to the actual optimal solutions.
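To make the idea of prioritizing sensors by residual battery life concrete, here is a minimal Python sketch of one possible greedy scheme. The abstract does not specify the actual algorithm, so everything here is an assumption: the names `build_cover` and `greedy_lifetime`, the fixed activation time slice, and the tie-breaking by sensor name are hypothetical choices for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple


def build_cover(
    coverage: Dict[str, Set[str]], residual: Dict[str, float], targets: Set[str]
) -> Optional[List[str]]:
    """Pick live sensors in order of residual battery until all targets are covered."""
    uncovered = set(targets)
    cover: List[str] = []
    live = [s for s in coverage if residual[s] > 0]
    # Highest residual battery first; sensor name breaks ties deterministically.
    for sensor in sorted(live, key=lambda s: (-residual[s], s)):
        if not uncovered:
            break
        if coverage[sensor] & uncovered:
            cover.append(sensor)
            uncovered -= coverage[sensor]
    return cover if not uncovered else None


def greedy_lifetime(
    coverage: Dict[str, Set[str]],
    battery: Dict[str, float],
    targets: Set[str],
    time_slice: float = 1.0,
) -> List[Tuple[List[str], float]]:
    """Repeatedly activate a cover for a short slice until no cover remains."""
    if not targets:
        return []
    residual = dict(battery)
    schedule: List[Tuple[List[str], float]] = []
    while True:
        cover = build_cover(coverage, residual, targets)
        if cover is None:
            break
        # Never run a cover longer than its weakest sensor can last.
        duration = min(time_slice, min(residual[s] for s in cover))
        for sensor in cover:
            residual[sensor] -= duration
        schedule.append((cover, duration))
    return schedule


# Toy instance (hypothetical): s1 alone covers both targets, s2 and s3 together do too.
coverage = {"s1": {"t1", "t2"}, "s2": {"t1"}, "s3": {"t2"}}
battery = {"s1": 1.0, "s2": 1.0, "s3": 1.0}
schedule = greedy_lifetime(coverage, battery, {"t1", "t2"})
total_lifetime = sum(duration for _, duration in schedule)
```

On this toy instance the sketch first activates `s1` and then the pair `s2`, `s3`, for a total lifetime of 2.0. A cover built this way may include redundant sensors; pruning them is left out to keep the sketch short.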