Detecting web attacks using random undersampling and ensemble learners

Zuech, Richard; Hancock, John; Khoshgoftaar, Taghi M.

doi:10.1186/s40537-021-00460-8

Cited by 54 publications

(26 citation statements)

References 36 publications


“…(iv) Random undersampling (RUS): this sampling method removes instances from the majority class to improve class imbalances toward the desired target classes. In [26,27], RUS is more successful than other sampling methods. Additionally, RUS has been used in other studies [28,29] to address the issue of class imbalance.…”

Section: Data Preprocessing (mentioning)

confidence: 99%

Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018

Hagar, Gawali (2022). Computational Intelligence and Neuroscience


Keeping computers secure is becoming more challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) play an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, it is the latest big data set for intrusion detection available to the public. This work proposes three models, two deep learning models (a convolutional neural network (CNN) and a long short-term memory (LSTM) network) and an Apache Spark model, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) were employed to select the important features, yielding 19 of the 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques were used to reduce the imbalance ratio. According to the experiments’ findings, the Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100%. For the F1-score, Apache Spark showed the highest results, with 1.00 for most classes. The three models showed outstanding results for multiclass network intrusion detection.
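The citation statement for this entry describes random undersampling (RUS) as removing majority-class instances to reduce class imbalance. A minimal sketch of that idea in plain Python follows. The function name `random_undersample`, the `ratio` parameter, and the toy "benign"/"attack" labels are illustrative assumptions; this is not the implementation used in any of the cited papers.

```python
import random
from collections import Counter


def random_undersample(rows, labels, majority_label, ratio=1.0, seed=0):
    """Randomly drop majority-class rows until there are about
    `ratio` majority rows per non-majority row (hypothetical helper)."""
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    minority_idx = [i for i, y in enumerate(labels) if y != majority_label]

    # Never try to keep more majority rows than actually exist.
    n_keep = min(len(majority_idx), int(round(ratio * len(minority_idx))))

    # Sample majority rows without replacement; keep every minority row.
    kept = rng.sample(majority_idx, n_keep) + minority_idx
    kept.sort()  # preserve the original row order
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (assumed for illustration): 990 benign rows, 10 attack rows.
labels = ["benign"] * 990 + ["attack"] * 10
rows = list(range(len(labels)))

_, y_bal = random_undersample(rows, labels, "benign", ratio=1.0)
_, y_3to1 = random_undersample(rows, labels, "benign", ratio=3.0)
print(Counter(y_bal))
print(Counter(y_3to1))
```

Only the majority class is sampled, so no minority (attack) instance is discarded; changing `ratio` is one simple way to compare different undersampling ratios.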


“…Class Imbalance is another critical attribute to be considered for building a novel and efficient architecture. Zuech et al (22) analyzed web attacks using random undersampling ratios under various ensemble learning algorithms and discussed Most of the research on Intrusion detection mechanisms was based on the ensemble learning approach. Fitni et al (24) made comparisons with seven single classifiers to identify the most appropriate basic classifiers for ensemble learning; they compared the accuracy metrics of tested architectures and made a cumulative study on it.…”

Section: Related Work (mentioning)

confidence: 99%

“…Class Imbalance is another critical attribute to be considered for building a novel and efficient architecture. Zuech et al ( 22 ) analyzed web attacks using random undersampling ratios under various ensemble learning algorithms and discussed class balance's significant importance. They observed that undersampling at different ratios could have a drastic effect on the model's performance and achieved an accuracy of about 94.01% accuracy on CIC IDS 2018 dataset using RCNN.…”

Section: Related Work (mentioning)

confidence: 99%

A Hybrid Framework for Intrusion Detection in Healthcare Systems Using Deep Learning

Kumaar, Samiayya, Vincent, et al. (2022). Front. Public Health


The unbounded increase in network traffic and user data has made it difficult for network intrusion detection systems to keep pace and perform well. Intrusion detection systems are crucial in e-healthcare since patients' medical records should be kept highly secure, confidential, and accurate. Any change in the actual patient data can lead to errors in diagnosis and treatment. Most of the existing artificial intelligence-based systems are trained on outdated intrusion detection repositories, which can produce more false positives and require retraining the algorithm from scratch to support new attacks. These processes also make it challenging to secure patient records in medical systems, as the intrusion detection mechanisms can frequently become obsolete. This paper proposes a hybrid deep learning framework named “ImmuneNet” to recognize the latest intrusion attacks and defend healthcare data. The proposed framework uses multiple feature engineering processes, oversampling methods to improve class balance, and hyper-parameter optimization techniques to achieve high accuracy and performance. The architecture contains <1 million parameters, making it lightweight, fast, and IoT-friendly, suitable for deploying the IDS on medical devices and healthcare systems. The performance of ImmuneNet was benchmarked against several other machine learning algorithms on the Canadian Institute for Cybersecurity's Intrusion Detection System 2017, 2018, and Bell DNS 2021 datasets, which contain extensive real-time and recent cyber attack data. Out of all the experiments, ImmuneNet performed best on the CIC Bell DNS 2021 dataset, with about 99.19% accuracy, 99.22% precision, 99.19% recall, and 99.2% ROC-AUC, which is better and more up-to-date than other existing approaches in classifying requests as normal, intrusion, or other cyber attacks.
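The abstract above reports accuracy, precision, and recall, and the accompanying citation statements stress class imbalance. As a rough illustration of how these metrics relate, the sketch below computes them from binary confusion-matrix counts. The helper name `binary_metrics` and the counts are hypothetical; they are not results from any of the cited papers.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 10 attacks among 1000 flows; the classifier finds
# 5 of the attacks and raises 5 false alarms.
m = binary_metrics(tp=5, fp=5, fn=5, tn=985)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, accuracy is 0.99 while precision, recall, and F1 are all 0.50, which is why accuracy alone can be misleading on heavily imbalanced data.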


“…It is well-suited for dealing with categorical features and can also handle missing values [29]. Furthermore, it is insensitive to the order of categorical features, which makes it robust to potential data leakage [30].…”

Section: Catboost (mentioning)

confidence: 99%

Supply Chain Fraud Prediction with Machine Learning and Artificial intelligence

Lokanan, Maddhesia (2022). Preprint


The increasing complexity of supply chains is putting pressure on businesses to find new ways to optimize efficiency and cut costs. One area that has seen a lot of recent development is machine learning (ML) and artificial intelligence (AI) to help manage supply chains. This paper employs machine learning (ML) and artificial intelligence (AI) algorithms to predict fraud in the supply chain. Supply chain data for this project was retrieved from real-world business transactions. The findings show that ML and AI classifiers did an excellent job predicting supply chain fraud. In particular, the AI model was the highest predictor across all performance measures. These results suggest that computational intelligence can be a powerful tool for detecting and preventing supply chain fraud. ML and AI classifiers can analyze vast amounts of data and identify patterns that may evade manual detection. The findings presented in this paper can be used to optimize supply chain management (SCM) and make predictions of fraudulent transactions before they occur. While ML and AI classifiers are still in the early stages of development, they have the potential to revolutionize SCM. Future research should explore how these techniques can be refined and applied to other domains.
