Border Gateway Protocol (BGP) is a vital internet protocol for transferring data packets among Autonomous Systems (ASes). Security is a major concern for BGP traffic, which is often disrupted by worms or hijacked by attackers, causing requests to enter black holes or connections to particular sites to be lost. BGP anomalies can be reduced by analyzing BGP datasets: since ASes communicate through messages, anomalies can be reduced by identifying the corrupted BGP messages in a dataset. In this paper, BGP anomalies are classified by applying machine learning (ML) algorithms. The dataset contains information about the sending and receiving times between ASes, and classifiers were used to predict the anomalies. Since the dataset is high-dimensional, its dimensionality was first reduced using Linear Discriminant Analysis (LDA); Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Linear Regression, Logistic Regression and Multi-Layer Perceptron (MLP) were then used to classify the anomalies.
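The reduce-then-classify idea described above can be illustrated with a minimal sketch. It assumes a two-class problem (normal vs. anomalous messages) and uses synthetic six-feature data in place of the real BGP dataset, which the abstract does not describe in detail. The helper names (`fit_lda_direction`, `knn_predict`) are hypothetical, and KNN stands in for just one of the listed classifiers.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def within_class_scatter(groups):
    """Sum of the per-class scatter matrices (S_w in Fisher's LDA)."""
    d = len(groups[0][0])
    s_w = [[0.0] * d for _ in range(d)]
    for rows in groups:
        mu = mean_vector(rows)
        for row in rows:
            diff = [x - m for x, m in zip(row, mu)]
            for i in range(d):
                for j in range(d):
                    s_w[i][j] += diff[i] * diff[j]
    return s_w

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_lda_direction(normal, anomalous):
    """Fisher direction w, the solution of S_w w = mu_anomalous - mu_normal."""
    s_w = within_class_scatter([normal, anomalous])
    diff = [a - b for a, b in zip(mean_vector(anomalous), mean_vector(normal))]
    return solve(s_w, diff)

def project(w, row):
    """Reduce one feature row to a single LDA coordinate."""
    return sum(wi * xi for wi, xi in zip(w, row))

def knn_predict(train_z, train_y, z, k=5):
    """Majority vote among the k nearest projected training points."""
    nearest = sorted(zip(train_z, train_y), key=lambda p: abs(p[0] - z))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

# Synthetic stand-in data (an assumption): 6 features, class 1 shifted by 2.0.
rng = random.Random(0)

def sample(center, n, d=6):
    return [[rng.gauss(center, 1.0) for _ in range(d)] for _ in range(n)]

normal_train, anomalous_train = sample(0.0, 100), sample(2.0, 100)
test_rows = sample(0.0, 50) + sample(2.0, 50)
test_y = [0] * 50 + [1] * 50

w = fit_lda_direction(normal_train, anomalous_train)
train_z = [project(w, r) for r in normal_train + anomalous_train]
train_y = [0] * 100 + [1] * 100

pred = [knn_predict(train_z, train_y, project(w, r)) for r in test_rows]
accuracy = sum(p == t for p, t in zip(pred, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

With two classes, LDA yields at most one discriminant direction, so every message is reduced to a single number before classification. The accuracy printed here reflects only the synthetic data and says nothing about the paper's results.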
Abstract: Border Gateway Protocol (BGP) is used to send and receive data packets over the internet. Over the years, this protocol has suffered major disruptions caused by worms such as Nimda, Slammer and Code Red, by hardware failures, and by prefix hijacking, which obstructed services for many users. However, identifying anomalous messages traversing BGP allows such attacks to be discovered in time. In this paper, a machine learning approach is applied to identify such BGP messages. Principal Component Analysis was applied to reduce the dimensionality to 2 components, followed by training Decision Tree, Random Forest, AdaBoost and GradientBoosting classifiers. After fine-tuning the parameters, the Random Forest classifier achieved an accuracy of 97.84%, followed closely by the Decision Tree classifier at 97.38%. The GradientBoosting classifier achieved an accuracy of 95.41% and the AdaBoost classifier 94.43%.
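A pipeline of this shape might look like the following sketch. It assumes scikit-learn; the abstract does not name a library, although its classifier names match scikit-learn's. Synthetic, well-separated blobs stand in for the BGP features, the feature scaling step is an added assumption, and all hyperparameters are library defaults rather than the tuned values behind the reported accuracies.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the BGP feature matrix (an assumption): two classes,
# 20 features. Real BGP data would be much harder to separate.
X, y = make_blobs(
    n_samples=600, n_features=20, centers=2, cluster_std=2.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, clf in classifiers.items():
    # Scale, project onto the first 2 principal components, then classify.
    model = make_pipeline(StandardScaler(), PCA(n_components=2), clf)
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

Placing PCA inside the pipeline means the components are fitted on the training split only, which keeps test data from leaking into the projection. The scores on this toy data are not comparable to the figures reported in the abstract.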
Background:
Customer Segmentation is the process of dividing customers into groups based on some
demographic factors in order to get an idea of the targeted audience for a product and to best market said product.
Objective:
Sentiment Analysis on customer reviews is one way that this process can be enhanced to get not just
demographic information but subjective information and preferences as well.
Methods:
In this study, a Long Short-Term Memory (LSTM) model, a deep learning technique, was applied for Sentiment
Analysis and its results have been used to perform Customer Segmentation on demographic data containing information
such as age and gender. Segmentation was performed using Spectral Clustering. Cluster Labels were extracted to perform
supervised classification using different supervised algorithms, such as Support Vector Machines, Random Forests,
Decision Trees and Logistic Regression.
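The clustering and classification stages described above could be sketched roughly as follows, assuming scikit-learn and NumPy. The customer table is synthetic, the `sentiment` column is a hypothetical stand-in for the LSTM output (the LSTM itself is not reproduced), and the choice of three clusters is an assumption, since the abstract does not give the number.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic customers (an assumption): age, binary-encoded gender, and a
# sentiment score in [0, 1] standing in for the LSTM prediction.
rng = np.random.default_rng(0)
n_customers = 300
age = rng.integers(18, 70, size=n_customers)
gender = rng.integers(0, 2, size=n_customers)
sentiment = rng.random(n_customers)

raw = np.column_stack([age, gender, sentiment]).astype(float)
features = StandardScaler().fit_transform(raw)

# Step 1: unsupervised segmentation. n_clusters=3 is an illustrative choice.
clusterer = SpectralClustering(n_clusters=3, affinity="rbf", random_state=0)
cluster_labels = clusterer.fit_predict(features)

# Step 2: treat the cluster labels as targets for supervised classifiers.
X_train, X_test, y_train, y_test = train_test_split(
    features, cluster_labels, test_size=0.25, random_state=0
)
classifiers = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
scores = {
    name: clf.fit(X_train, y_train).score(X_test, y_test)
    for name, clf in classifiers.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In a setup like this, the classifiers are scored on how well they reproduce the cluster assignments derived from the same features, so accuracy measures agreement with the clustering, not with any independent ground truth.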
Results:
An accuracy of 90.9% was achieved by the LSTM model. An accuracy of 100% was achieved by the Random
Forest and Decision Tree Classifiers.