Abstract--K-means is a popular unsupervised clustering algorithm that requires a huge initial set to start the clustering and does not guarantee convergence. Numerous improvements to K-means have been proposed to improve its performance. Expectation Maximization (EM) is a statistical technique for maximum likelihood estimation using mixture models. It searches for a local maximum and generally converges very well. The proposed algorithm combines the two to generate optimal clusters that do not require a huge value of K; each cluster attains a more natural shape, and convergence is guaranteed. The paper compares the new method with Fuzzy K-means on the benchmark Iris data set.
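The abstract above does not say how the two algorithms are combined. As a minimal sketch only, and not the authors' method, one common pattern is to run K-means first and use its clusters to initialize EM for a Gaussian mixture. The function names (`kmeans_1d`, `em_gmm_1d`), the 1-D toy data, and the variance floor `min_var` are all assumptions made for illustration.

```python
import math

def kmeans_1d(xs, k, iters=20):
    """Plain Lloyd's K-means on 1-D data (assumes k >= 2).
    Initial centers are spread evenly over the data range."""
    lo, hi = min(xs), max(xs)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            # assign each point to its nearest center
            j = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            groups[j].append(x)
        # recompute centers; an empty cluster keeps its old center
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

def gauss(x, mu, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, centers, groups, iters=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture, initialized from K-means output.
    Returns weights, means, variances and the log-likelihood per iteration."""
    n, k = len(xs), len(centers)
    weights = [len(g) / n for g in groups]
    means = list(centers)
    variances = [max(sum((x - m) ** 2 for x in g) / len(g), min_var) if g else 1.0
                 for m, g in zip(means, groups)]
    history = []
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp, ll = [], 0.0
        for x in xs:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            ll += math.log(total)
            resp.append([pj / total for pj in p])
        history.append(ll)
        # M-step: re-estimate parameters from the soft assignments
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue  # leave a dead component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                min_var)
    return weights, means, variances, history

# Toy data with two well-separated clumps (illustrative, not the Iris data).
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.3, 4.7, 5.1, 4.9]
centers, groups = kmeans_1d(xs, k=2)
weights, means, variances, history = em_gmm_1d(xs, centers, groups)
```

In this sketch K-means supplies hard assignments, and EM then refines them into soft assignments with per-cluster variances, which is one way clusters can take a shape other than the equal-radius regions K-means implies.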
Intrusion detection systems have been studied for many years, yet a great many problems remain, primarily a high percentage of false alarms and abysmal detection rates. A new type of threat has emerged: the Advanced Persistent Threat (APT). This type of attack is sophisticated, moves slowly over a long period of time, and is found in networked systems. Such threats may be detected by evaluating large numbers of state variables that describe complex system operation and state transitions over time. Analyzing such large numbers of variables is computationally inefficient, especially if it is to be done in real time. The paper develops a completely new theoretical model that appears able to distill high-order state variable data sets down to the essential analytic changes in a system in which an APT is operating. The model is based on the computationally efficient use of integer vectors. This approach can analyze threats over time and has the potential to detect, predict, and classify new threats as similar to threats already detected. The model is highly theoretical at this point, with some initial prototype work and some initial performance data presented.