Artificial intelligence algorithms play a leading role in cybersecurity and attack detection, and in some scenarios they outperform classic intrusion detection systems (IDS) such as Snort or Suricata. This research evaluates the characteristics of several well-established Machine Learning algorithms commonly applied to IDS scenarios. To do this, it first considers a categorization of cybersecurity data sets that divides their records into several groups. Using this division, the work seeks to determine which neural network model (multilayer or recurrent), activation function, and learning algorithm yield the highest accuracy for each group of data. Finally, the results are used to determine which groups of data in a cybersecurity data set are most relevant and representative for intrusion detection, and which Machine Learning configuration is most suitable for decreasing the computational load of the system.
Security in IoT networks is now mandatory because of the large amount of data that has to be handled. These systems are vulnerable to cybersecurity attacks, which are increasing in number and sophistication. For this reason, new intrusion detection techniques that are as accurate as possible in these scenarios have to be developed. Intrusion detection systems based on machine learning algorithms have already shown high accuracy. This research studies and evaluates several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm. The evaluation uses two benchmark datasets, UGR16 and UNSW-NB15, and one of the most widely used datasets, KDD99. The preprocessing techniques were evaluated in accordance with scalar and normalization functions. All of these preprocessing models were applied to different sets of characteristics, based on a categorization composed of four groups of features: basic connection features, content characteristics, statistical characteristics, and a group composed of traffic-based features and connection direction-based traffic characteristics. The objective of this research is to evaluate this categorization by using various data preprocessing techniques to obtain the most accurate model. Our proposal shows that, by applying the categorization of network traffic and several preprocessing techniques, accuracy can be enhanced by up to 45%. Preprocessing a specific group of characteristics yields greater accuracy, allowing the machine learning algorithm to correctly classify the parameters related to possible attacks.
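As a rough illustration of the idea described above, the following minimal sketch applies a scaling or normalization function to one feature group at a time. It is not the authors' implementation: the group names follow the four categories listed in the abstract, but the column names, the `FEATURE_GROUPS` mapping, and the helper functions are hypothetical, and min-max scaling and z-score standardization are assumed here as typical examples of scaling and normalization functions.

```python
from statistics import mean, pstdev

# Hypothetical mapping from feature group to column names.
# The column names are illustrative only; they are not taken from the datasets.
FEATURE_GROUPS = {
    "basic": ["duration", "src_bytes"],
    "content": ["num_failed_logins"],
    "statistical": ["mean_pkt_size"],
    "traffic": ["count", "srv_count"],
}


def min_max_scale(values):
    """Rescale a column linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Center a column at zero mean and scale it to unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def preprocess_group(records, group, transform):
    """Apply `transform` column by column to the features of one group.

    `records` is a list of dicts (one per flow or connection). The result is
    a list of rows containing only the transformed columns of `group`.
    """
    columns = FEATURE_GROUPS[group]
    scaled = {c: transform([r[c] for r in records]) for c in columns}
    return [[scaled[c][i] for c in columns] for i in range(len(records))]


# Toy records with made-up values, used only to show the call pattern.
records = [
    {"duration": 0, "src_bytes": 100, "num_failed_logins": 0,
     "mean_pkt_size": 40, "count": 1, "srv_count": 1},
    {"duration": 5, "src_bytes": 300, "num_failed_logins": 2,
     "mean_pkt_size": 60, "count": 2, "srv_count": 3},
    {"duration": 10, "src_bytes": 200, "num_failed_logins": 1,
     "mean_pkt_size": 80, "count": 3, "srv_count": 2},
]

basic_scaled = preprocess_group(records, "basic", min_max_scale)
# basic_scaled == [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

Under these assumptions, each combination of feature group and transform would produce a separate input matrix, so a classifier could be trained once per combination and the resulting accuracies compared.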
Network Digital Twin (NDT) is a new technology that builds on the concept of Digital Twins (DT) to create a virtual representation of the physical objects of a telecommunications network. NDT bridges physical and virtual spaces to enable coordination and synchronization of physical parts while eliminating the need to interact with them directly. There is broad consensus that Artificial Intelligence (AI) and Machine Learning (ML) are among the key enablers of this technology. In this work, we present B5GEMINI, an NDT for 5G and beyond networks that makes extensive use of AI and ML. First, we present the infrastructural and architectural components that support B5GEMINI. Next, we explore four paradigmatic applications where AI/ML can leverage B5GEMINI to build new AI-powered applications. In addition, we identify the main components of the AI ecosystem of B5GEMINI, outlining emerging research trends and identifying the open challenges that must be solved along the way. Finally, we present two relevant use cases of NDTs that make extensive use of ML. The first use case lies in the cybersecurity domain and proposes the use of B5GEMINI to facilitate the design of ML-based attack detectors. The second addresses the design of energy-efficient ML components and introduces, as a novelty, the modular development of NDTs adopting the Digital Map concept.