2017 IEEE Conference on Dependable and Secure Computing 2017
DOI: 10.1109/desec.2017.8073860

Development of a network intrusion detection system using Apache Hadoop and Spark

Cited by 27 publications (11 citation statements)
References 20 publications
“…The Pipeline will create the workflow of the algorithm and the implementation done based on the ordering of passed variables that it means first the StringIndexer, second VectorAssembler, third RandomForest, fourth IndexToString and then the model training on the training dataset after that make a prediction on the testing dataset. Kato and Klyuev [24] proposed the anomaly detection system with Apache Spark and Hadoop and by use of Hive table and unsupervised learning algorithm like K-means and also GMM algorithm, this system capable of managing and detecting an enormous dataset about 90 GB quickly with low rate of false alarm and high value about 86% of accuracy. Gupta and Kulariya [25] proposed a framework for intrusion detection system based on Apache Spark, they used feature selections as correlation based and chi-squared with different algorithms such as Random Forest, Logistic Regression and other algorithms and evaluated the performance of each algorithm on NSL-KDD and KDD'99.…”
Section: Analysis Of Empirical Results (citation type: mentioning)
confidence: 99%
“…The analysis of some existing data sets (UNB-ISCX-2012 [30], CTU-13 [31], MACCDC [32] or UGR'16 [33]) allows us to observe that they have different formats and feature, so that we can say that cybersecurity data sets are highly heterogeneous.…”
Section: Categorization Of a Cybersecurity Data Set (citation type: mentioning)
confidence: 99%
“…Nowadays, there exist different cybersecurity datasets that can be used for IDS based ML experimentation, i.e., UNB-ISCX-2012 [26], CTU-13 [27], MACCDC [28], UGR-16 [29], CICDS [30], KDD-99, NSL-KDD [31], or UNSW-NB15 [32]. Some of them have been widely used, like for instance the dataset KDD-99, which has been established as the main benchmark dataset for the different studies cases in the application of ML-based IDS.…”
Section: Cybersecurity Datasets (citation type: mentioning)
confidence: 99%