2024
DOI: 10.1186/s40537-024-00886-w

Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction

Md. Alamin Talukder,
Md. Manowarul Islam,
Md Ashraf Uddin
et al.

Abstract: Cybersecurity has emerged as a critical global concern. Intrusion Detection Systems (IDS) play a critical role in protecting interconnected networks by detecting malicious actors and activities. Machine Learning (ML)-based behavior analysis within the IDS has considerable potential for detecting dynamic cyber threats, identifying abnormalities, and identifying malicious conduct within the network. However, as the number of data grows, dimension reduction becomes an increasingly difficult task when training ML …

Citations: cited by 33 publications (3 citation statements)
References: 67 publications
“…Fig. 5 shows the ROC curve, which integrates response sensitivity and ongoing specificity variables, potentially revealing the relationship between these two metrics [36,42]. It displays classifier performance in Binary Classification Tasks derived from computing the True Positive Rate (TPR) and False Positive Rate (FPR) across various thresholds.…”
Section: Results (mentioning)
confidence: 99%
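
The statement above describes a ROC curve as the True Positive Rate and False Positive Rate computed across various thresholds. A minimal sketch of that computation follows, in plain Python. The function names `roc_points` and `auc` and the toy labels and scores are illustrative assumptions; they are not taken from the cited paper or the citing one.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold.

    labels: 0/1 ground truth (1 = positive, e.g. an attack).
    scores: classifier scores, higher meaning "more likely positive".
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # Start at (0, 0): a threshold above every score predicts no positives.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data, made up for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the curve, which runs from (0, 0) to (1, 1).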
“…This ensemble approach increases accuracy and provides a reliable way to handle large datasets efficiently. RF has a fast training speed, requires minimal parameter tuning, and effectively addresses the challenges of overfitting by exploiting variability among trees to produce more stable and accurate predictions [36].…”
Section: Random Forest (mentioning)
confidence: 99%
“…Another crucial function is associated with preserving the integrity of each of the data chunks. The data would not be reconstructed even if a single data chunk was tampered with by any of the storage nodes [13,14]. Moreover, the issues related to the processing time of service requests are also crucial.…”
Section: Introduction (mentioning)
confidence: 99%