Abstract-Terrorist attacks are among the most challenging problems facing mankind worldwide and demand the full attention of researchers and practitioners. Predicting the terrorist group responsible for attacks and activities from historical data is a complicated task because of the lack of detailed terrorist data. This research predicts the terrorist groups responsible for attacks in Egypt from 1970 to 2013 using data mining classification techniques. It compares five base classifiers, namely Naïve Bayes (NB), K-Nearest Neighbour (KNN), Tree Induction (C4.5), Iterative Dichotomiser (ID3), and Support Vector Machine (SVM), on real data from the Global Terrorism Database (GTD) of the National Consortium for the Study of Terrorism and Responses to Terrorism (START). The goal of this research is to present two different approaches to handling missing data, to provide a detailed comparative study of the classification algorithms used, and to evaluate the obtained results via two different test options. Experiments are performed on real-life data with the help of WEKA, and the final evaluation and conclusions are based on four performance measures. The results show that SVM is more accurate than NB and KNN in the mode imputation approach, and that ID3 has the lowest classification accuracy although it performs well on the other measures. In the listwise deletion approach, KNN outperforms the other classifiers in accuracy, but the overall performance of SVM is more acceptable than that of the other classifiers.
Index Terms-KDD, precision, recall, terrorist group.
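As an illustration only, the two missing-data approaches named in the abstract, listwise deletion and mode imputation, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the WEKA procedure used in the experiments: the attribute names, the toy records, and the use of `None` as the missing-value marker are all hypothetical and are not taken from the GTD.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def listwise_deletion(records):
    """Keep only the records in which no attribute is missing."""
    return [r for r in records if all(v is not MISSING for v in r.values())]


def mode_imputation(records):
    """Fill each missing value with the most frequent observed value
    of the same attribute. The input records are not modified."""
    if not records:
        return []
    modes = {}
    for key in records[0]:
        observed = [r[key] for r in records if r[key] is not MISSING]
        # most_common(1) returns [(value, count)]; ties go to the value seen first
        modes[key] = Counter(observed).most_common(1)[0][0] if observed else MISSING
    return [
        {k: (modes[k] if v is MISSING else v) for k, v in r.items()}
        for r in records
    ]


# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"attack_type": "bombing", "target_type": "police", "group": "A"},
    {"attack_type": "bombing", "target_type": None, "group": "A"},
    {"attack_type": None, "target_type": "tourists", "group": "B"},
    {"attack_type": "armed assault", "target_type": "police", "group": "B"},
]

complete_only = listwise_deletion(records)  # drops the two incomplete records
imputed = mode_imputation(records)          # keeps all four records
```

Listwise deletion shrinks the data set to its complete records, whereas mode imputation keeps every record at the cost of pulling missing attributes toward the most frequent value, which is one reason the two approaches can rank classifiers differently.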
I. INTRODUCTION
Terrorist attacks are a major and challenging issue throughout the world and one of the central concerns of all governments. Data mining, popularly known as Knowledge Discovery in Databases (KDD), is a logical process of discovering new patterns in large data sets using methods that combine statistics, database systems, support vector machines, artificial intelligence, meta-heuristics, and machine learning. The main goal of data mining is to extract useful, hidden predictive knowledge from large data sets in a human-understandable structure. It involves database and data management, pre-processing tools, model and interface capabilities, post-processing of discovered structures, visualization, and online updating methods for finding hidden patterns and predictive information that experts may miss because it lies outside their expectations [1], [2]. Data mining and automated data analysis techniques have become an effective and important part of many applications, ranging from the marketing and advertising of goods, services, or products to artificial intelligence research, biological sciences, crime investigation, and high-level government intelligence [3]. Recently there has been much interest in using data mining to detect and investigate unusual patterns, crimes, and terrorist activities and to prevent the fraudulent...