Christopher Nixon scite author profile

Practical Application of Machine Learning based Online Intrusion Detection to Internet of Things Networks

Nixon¹, Sedky², Hassan³

2019


Internet of Things (IoT) devices participate in an open and distributed perception layer, where vulnerability to cyber attacks is becoming a key concern for data privacy and service availability. The perception layer poses a unique challenge for intrusion detection because resources are constrained and networks are distributed. An additional challenge is that IoT networks generate a continuous, non-stationary data stream that, due to its variable nature, is likely to experience concept drift. This research aimed to review the practical applications of online machine learning methods for IoT network intrusion detection, to answer the question of whether a resource-efficient architecture can be provided. An online learning architecture is introduced, with related IDS approaches reviewed and evaluated. Online learning provides a potentially memory- and time-efficient architecture that can adapt to concept drift and perform anomaly detection, providing solutions for the resource-constrained and distributed IoT perception layer. Future research should focus on addressing class imbalance in the data streams to ensure that minority attack classes are not missed.
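To illustrate the online learning setting this abstract describes, here is a minimal Python sketch of test-then-train (prequential) evaluation on a stream with one abrupt concept drift. It is not the architecture from the paper: the learner (`OnlineLogisticRegression`), the synthetic stream, and all parameter values are assumptions chosen for illustration. It shows only that a single-pass learner with constant memory can recover after the underlying concept changes.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical single-pass learner: constant memory, one SGD step per sample."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def learn_one(self, x, y):
        err = self.predict_proba(x) - y  # gradient of log loss w.r.t. z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_stream(n, drift_at, seed=0):
    """Assumed toy stream: the labelling rule flips at sample `drift_at`."""
    rng = random.Random(seed)
    for i in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = int(x[0] > 0) if i < drift_at else int(x[0] < 0)
        yield x, y


def prequential_accuracy(model, stream, window=200):
    """Test-then-train: predict each sample first, then learn from it."""
    recent, accs = [], []
    for x, y in stream:
        recent.append(int(model.predict(x) == y))
        model.learn_one(x, y)
        if len(recent) == window:
            accs.append(sum(recent) / window)
            recent = []
    return accs


accs = prequential_accuracy(OnlineLogisticRegression(2), synthetic_stream(2000, 1000))
for i, acc in enumerate(accs):
    print(f"samples {i * 200:4d}-{i * 200 + 199:4d}: accuracy {acc:.2f}")
```

In this toy setup the windowed accuracy drops in the window that follows the drift and then climbs back as the model keeps updating. The model stores only its weights, never the stream itself.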


Reviews in Online Data Stream and Active Learning for Cyber Intrusion Detection - A Systematic Literature Review

Nixon¹, Sedky², Hassan³

2021


SALAD: An Exploration of Split Active Learning based Unsupervised Network Data Stream Anomaly Detection using Autoencoders

Nixon¹, Sedky², Hassan³

2023

Preprint


Machine learning-based intrusion detection systems monitor network data streams for cyber attacks. Challenges in this space include detection of unknown attacks, adaptation to changes in the data stream such as changes in underlying behaviour, the human cost of labeling data to retrain the machine learning model, and the processing and memory constraints of a real-time data stream. Failure to manage these factors could result in missed attacks, degraded detection performance, unnecessary expense or delayed detection times. This research evaluated autoencoders, a type of feed-forward neural network, as online anomaly detectors for network data streams. The autoencoder method was combined with an active learning strategy to further reduce labeling cost and speed up training and adaptation times, resulting in a proposed Split Active Learning Anomaly Detector (SALAD) method. The proposed method was evaluated with the NSL-KDD, KDD Cup 1999, and UNSW-NB15 data sets, using the scikit-multiflow framework. Results demonstrated that a novel Adaptive Anomaly Threshold method, combined with a split active learning strategy, offered superior anomaly detection performance with a labeling budget of just 20%, significantly reducing the human expertise required to annotate the network data. Processing times of the autoencoder anomaly detector method were demonstrated to be significantly lower than those of traditional online learning methods, allowing for greatly improved responsiveness to attacks occurring in real time. Future research areas are applying unsupervised threshold methods, multi-label classification, sample annotation, and hybrid intrusion detection.
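The abstract does not spell out how the labeling budget or the Adaptive Anomaly Threshold work, so the following Python sketch is an assumption rather than the SALAD method. It shows one simple way a stream detector could request labels only for scores near its threshold, stay within a 20% labeling budget, and nudge the threshold when a queried label contradicts its prediction. The class `BudgetedThresholdDetector`, its parameters, and the synthetic score ranges are all hypothetical.

```python
import random


class BudgetedThresholdDetector:
    """Hypothetical sketch: threshold detector with budgeted label queries."""

    def __init__(self, threshold, budget=0.2, margin=0.25, step=0.05):
        self.threshold = threshold
        self.budget = budget  # maximum fraction of samples that may be labelled
        self.margin = margin  # relative band around the threshold treated as uncertain
        self.step = step      # how far the threshold moves toward a corrected score
        self.seen = 0
        self.queried = 0

    def process(self, score, oracle):
        """Return True if `score` is flagged; `oracle()` is called only on a query."""
        self.seen += 1
        predicted = score > self.threshold
        uncertain = abs(score - self.threshold) <= self.margin * self.threshold
        within_budget = (self.queried + 1) / self.seen <= self.budget
        if uncertain and within_budget:
            self.queried += 1
            is_attack = oracle()  # the costly human annotation
            if is_attack != predicted:
                # Move the threshold a little toward the misclassified score.
                self.threshold += self.step * (score - self.threshold)
        return predicted


# Assumed synthetic anomaly scores: normal traffic scores low, attacks score high.
rng = random.Random(42)
det = BudgetedThresholdDetector(threshold=1.2)
for _ in range(2000):
    is_attack = rng.random() < 0.05
    score = rng.uniform(2.5, 3.5) if is_attack else rng.uniform(0.6, 1.6)
    det.process(score, lambda: is_attack)

print(f"queried {det.queried} of {det.seen} samples, threshold now {det.threshold:.2f}")
```

The starting threshold of 1.2 is deliberately too low for this synthetic data, so some normal samples are flagged. Queried labels for those false alarms are what move the threshold upward, while the budget check keeps the labelled fraction at or below 20%.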


SALAD: An Exploration of Split Active Learning based Unsupervised Network Data Stream Anomaly Detection using Autoencoders

Nixon¹, Sedky², Hassan³

2021

Preprint




Autoencoders

Nixon¹, Sedky², Hassan³

2020



Computer networks are vulnerable to cyber attacks that can affect the confidentiality, integrity and availability of mission-critical data. Intrusion detection methods can be employed to detect these attacks in real time. Anomaly detection offers the advantage of detecting unknown attacks in a semi-supervised fashion. This paper aims to answer the question of whether autoencoders, a type of semi-supervised feedforward neural network, can provide a low-cost anomaly detection method for computer network data streams. Autoencoder methods were evaluated online with the well-known KDD Cup 1999 and UNSW-NB15 data sets, and it was demonstrated that running time and labeling cost are significantly reduced compared to traditional online classification techniques for similar detection performance. Further research would consider the trade-off between single and stacked networks, multi-label classification, concept drift detection and active learning.
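The idea of scoring anomalies by autoencoder reconstruction error can be sketched in a few lines of Python. This is a toy under stated assumptions, not the network evaluated in the paper: it uses a linear autoencoder with a single hidden unit and no activation function, trained online on unlabelled synthetic "normal" samples that lie near one direction in two dimensions. The class `TinyLinearAutoencoder`, the data generator, and the learning rate are hypothetical.

```python
import random


class TinyLinearAutoencoder:
    """Hypothetical linear autoencoder, updated one sample at a time with SGD."""

    def __init__(self, n_inputs, n_hidden=1, lr=0.1):
        # Fixed small initial weights; adequate for the single hidden unit used here.
        self.enc = [[0.1] * n_inputs for _ in range(n_hidden)]
        self.dec = [[0.1] * n_hidden for _ in range(n_inputs)]
        self.lr = lr

    def _forward(self, x):
        h = [sum(w * xi for w, xi in zip(row, x)) for row in self.enc]
        x_hat = [sum(w * hj for w, hj in zip(row, h)) for row in self.dec]
        return h, x_hat

    def score(self, x):
        """Anomaly score: squared reconstruction error."""
        _, x_hat = self._forward(x)
        return sum((a - b) ** 2 for a, b in zip(x_hat, x))

    def learn_one(self, x):
        h, x_hat = self._forward(x)
        err = [a - b for a, b in zip(x_hat, x)]
        # Backpropagate the error to the hidden layer before changing the decoder.
        grad_h = [sum(self.dec[i][j] * err[i] for i in range(len(x)))
                  for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                self.dec[i][j] -= self.lr * err[i] * h[j]
        for j in range(len(h)):
            for k in range(len(x)):
                self.enc[j][k] -= self.lr * grad_h[j] * x[k]


def normal_sample(rng):
    """Assumed normal behaviour: points near the direction (0.6, 0.8) plus small noise."""
    t = rng.uniform(-1, 1)
    return [0.6 * t + rng.gauss(0, 0.02), 0.8 * t + rng.gauss(0, 0.02)]


rng = random.Random(1)
ae = TinyLinearAutoencoder(n_inputs=2)
for _ in range(3000):
    ae.learn_one(normal_sample(rng))  # no labels are needed for training

mean_normal_score = sum(ae.score(normal_sample(rng)) for _ in range(200)) / 200
anomaly_score = ae.score([0.8, -0.6])  # a point far from the learned direction
print(f"mean normal score {mean_normal_score:.4f}, anomaly score {anomaly_score:.4f}")
```

Samples that resemble the training stream are reconstructed well and score low, while the off-pattern point scores much higher. A threshold on this score would turn it into an alert, and choosing that threshold is a separate design decision not covered here.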
