2020
DOI: 10.1109/access.2020.3009843

CICIDS-2017 Dataset Feature Analysis With Information Gain for Anomaly Detection

Abstract: Feature selection (FS) is one of the important tasks of data preprocessing in data analytics. The data with a large number of features will affect the computational complexity, increase a huge amount of resource usage and time consumption for data analytics. The objective of this study is to analyze relevant and significant features of huge network traffic to be used to improve the accuracy of traffic anomaly detection and to decrease its execution time. Information Gain is the most feature selection technique…

citations
Cited by 253 publications
(107 citation statements)
references
References 48 publications
“…The algorithm was used by the authors of the CIC-IDS-2017 dataset (Sharafaldin et al, 2018), and also by Kurniabudi et al (2020). Sharafaldin et al (2018) obtained results for the weighted averages of Precision, Recall and F1 of 0.98, 0.97, and 0.97.…”
Section: Random Forest Trees (Rft) mentioning
confidence: 99%
“…Typically, given these considerations, one performs k-fold cross validation using k = 5 or k = 10. These values have been shown empirically to yield test error rate estimates that suffer neither from excessively high bias nor from very high variance [70]. Fig.…”
Section: Figure 10 Detection Accuracy Ensemble With J48 On the Best mentioning
confidence: 87%
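
The k-fold procedure mentioned in the statement above can be illustrated with a minimal sketch. It is not taken from the citing paper or from the cited study; the function name `k_fold_indices` is hypothetical, and the folds are contiguous and unshuffled to keep the example short (in practice the data is usually shuffled or stratified first).

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Hypothetical helper for illustration only: no shuffling or
    stratification is applied.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

Each sample lands in exactly one test fold, so averaging the k test scores gives the cross-validated estimate the statement refers to.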
“…The IGD method is modified from information gain and combined with the concept of objective distance [17]-[18]. Information gain is known as one of the most utilized attribute selection methods to choose irrelevant attributes [30]-[31]. This method generally determines the attribute importance by measuring reductions in entropy, introduced by Shannon in 1948 [32].…”
Section: Literature Review mentioning
confidence: 99%
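
As a rough illustration of "measuring reductions in entropy", the sketch below computes information gain for a single categorical attribute. It is an assumption-laden toy, not the method of the indexed paper or of the citing work: the function names `entropy` and `information_gain` are hypothetical, and continuous traffic features would first need to be discretized.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after
    splitting the samples by the value of one categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

A feature whose values separate the classes perfectly has a gain equal to the label entropy, while a feature that is independent of the labels has a gain of zero; ranking features by this score is the usual basis for selection.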