A Novel Sensor Data Pre-Processing Methodology for the Internet of Things Using Anomaly Detection and Transfer-By-Subspace-Similarity Transformation

Zhong, Yan; Fong, Simon; Hu, Shimin; Wong, Raymond K.; Lin, Weiwei

doi:10.3390/s19204536

Cited by 13 publications

(10 citation statements)

References 25 publications

“…In [14], it was recommended fresh IDS by utilizing Transfer by Subspace Similarity classification algorithm that improves the gathered data quality by computing the relevance correlations between variable and filling the missing values in the dataset. Assessment conducted on NSL-KDD database demonstrated that the method is highly successful compared with that which is reliant on Transfer by Subspace Similarity technique.…”

Section: Related Work (mentioning)

confidence: 99%

Hybrid Deep-GAN Model for Intrusion Detection in IoT Through Enhanced Whale Optimization

Balaji

Narayanan

2022

IJC

IoT networks are emerging as a significant area of growth in modern communication applications. These networks are formed of resource-constrained sensor nodes, and their open wireless transmission makes them prone to security threats. An efficient Intrusion Detection System (IDS) aids in detecting attacks and takes crucial countermeasures to ensure secure and reliable operation. However, because of the widespread nature of IoT, the IDS is expected to operate in a distributed form with less reliance on a central manager. To overcome these issues, a Distributed Generative Adversarial Network (D-GAN) with Enhanced Whale Optimization and Distributed deep learning based on an Artificial Neural Network (EWO-HDL+ANN) is proposed. Here the GAN can detect internal attacks, and the D-GAN is capable of detecting both internal and external attacks effectively. Transfer By Subspace Similarity is used to carry out preprocessing. The preprocessed data is then fed into the feature extraction stage, where Modified Principal Component Analysis (MPCA) is applied to extract new features. Feature selection is then performed by the Enhanced Whale Optimization Algorithm, which separates significant features from superfluous ones in the dataset and improves classification accuracy by favoring the greatest fitness value. Finally, intrusion detection is performed by the HDL+ANN algorithm. The experimental results indicate that the introduced EWO-DDL+ANN method provides an enhanced intrusion detection system in terms of higher accuracy, precision, recall, and F-measure, and a lower False Positive Rate.

“…In general, vast amounts of sensing data from industrial IoT deployed in poor environments are frequently generated by repetitive and irregular data owing to various transmission errors, sensor malfunctions, and cases of external interference, which can eventually lead to poor performance due to incorrect predictions being made in case-based reasoning [19, 20]. Therefore, we propose the use of data collection, data preprocessing, and ANN to effectively improve prediction performance through the accurate pattern learning of sensed data [21].…”

Section: Data Mining (mentioning)

confidence: 99%

Intelligent Dynamic Real-Time Spectrum Resource Management for Industrial IoT in Edge Computing

Yun

Lee

2021

Sensors

Intelligent dynamic spectrum resource management, which is based on vast amounts of sensing data from industrial IoT in the space–time and frequency domains, uses optimization algorithm-based decisions to minimize levels of interference, such as energy consumption, power control, idle channel allocation, time slot allocation, and spectrum handoff. However, these techniques make it difficult to allocate resources quickly and waste valuable solution information that is optimized according to the evolution of spectrum states in the space–time and frequency domains. Therefore, in this paper, we propose the implementation of intelligent dynamic real-time spectrum resource management through the application of data mining and case-based reasoning, which reduces the complexity of existing intelligent dynamic spectrum resource management and enables efficient real-time resource allocation. In this case, data mining and case-based reasoning analyze the activity patterns of incumbent users using vast amounts of sensing data from industrial IoT and enable rapid resource allocation, making use of case DB classified by case. In this study, we confirmed a number of optimization engine operations and spectrum resource management capabilities (spectrum handoff, handoff latency, energy consumption, and link maintenance) to prove the effectiveness of the proposed intelligent dynamic real-time spectrum resource management. These indicators prove that it is possible to minimize the complexity of existing intelligent dynamic spectrum resource management and maintain efficient real-time resource allocation and reliable communication; also, the above findings confirm that our method can achieve a superior performance to that of existing spectrum resource management techniques.

“…Choose a state from the state set as the initial state S(t);
while t is less than the user's termination iterations do
    Choose an action from the action set {G, H, I} by e-greedy strategy according to the definition of action in Part C of Section 3;
    Execute the selected action on the current state S(t) to jump to the next state S(t + 1);
    Perform crossover operation with global variable on the features contained in state S(t + 1);
    Calculate the fitness of the multidimensional data discretization scheme after crossover operation using equation (5);
    Measure the corresponding reward using equation (11) according to the definition of reward in Part C of Section 3;
    Update crossover Q-Table using equation (6);
    if the fitness of the multidimensional data discretization scheme > local variable do
        Update local variable with the fitness of the multidimensional data discretization scheme;
    end
    Perform crossover operation in P(t);
    Calculate the fitness of each individual in P(t) using equation (8);
    Update global variable with the optimal individual fitness value in P(t);
    S(t) ← S(t + 1);
    Choose an action from the action set {G, H, I} by e-greedy strategy according to the definition of action in Part C of Section 3;
    Execute the selected action on the current state S(t) to jump to the next state S(t + 1);
    Perform mutation operation on the features contained in state S(t + 1);
    Calculate the fitness of the multidimensional data discretization scheme after mutation operation using equation (5);
    Measure the corresponding reward using equation (11) according to the definition of reward in Part C of Section 3;
    Update mutation Q-Table using equation (6);
    if the fitness of the multidimensional data discretization scheme > local variable do
        Update local variable with the fitness of the multidimensional data discretization scheme;
    end
    Perform mutation operation in P(t);
    Calculate the fitness of each individual in P(t) using equation (8);
    Update global variable with the optimal individual fitness value in P(t);
    t ← t + 1;
end
Return Max(global variable, local variable);
end
ALGORITHM 2: RLGA algorithm process.…”

Section: Configuration Of Experimental (mentioning)

confidence: 99%
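
The action-selection and Q-Table steps in the quoted RLGA pseudocode can be illustrated with a minimal sketch. It assumes that "e-greedy" means the usual epsilon-greedy rule and that the Q-Table update follows the textbook Q-learning form; the paper's equation (6) is not reproduced on this page, so the update below is an assumption, not the authors' formula. The names `choose_action`, `update_q`, `alpha`, and `gamma` are hypothetical; only the action labels G, H, I come from the quoted text.

```python
import random

ACTIONS = ["G", "H", "I"]  # action labels taken from the quoted pseudocode


def _row(q_table, state):
    """Return the Q-values for a state, creating a zero row on first visit."""
    return q_table.setdefault(state, {a: 0.0 for a in ACTIONS})


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    row = _row(q_table, state)
    return max(ACTIONS, key=lambda a: row[a])


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Textbook Q-learning update (assumed stand-in for the paper's equation (6))."""
    row = _row(q_table, state)
    target = reward + gamma * max(_row(q_table, next_state).values())
    row[action] += alpha * (target - row[action])
    return row[action]
```

In the quoted algorithm, two such tables would be kept, one for crossover and one for mutation, each updated after the corresponding genetic operation and reward measurement.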

“…These are mainly from various types of sensors, with high-dimensional, incomplete, random, fuzzy, and strong interference and other characteristics [6]. Despite the growing body of artificial intelligence research, how to extract and analyze valuable information from these massive amounts of complex sensor data is still a huge challenge in the field of artificial intelligence [7][8][9]. As one of the most influential data preprocessing technologies, feature discretization can reduce the complexity of data by transforming the continuous features in massive data to discrete features and obtain shorter, more accurate, and more comprehensible rules, so as to improve the efficiency of data mining and machine learning [10][11][12][13][14][15][16].…”

Section: Introduction (mentioning)

confidence: 99%
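
To make the idea of feature discretization in the statement above concrete, here is a minimal sketch of the simplest scheme, equal-width binning of one continuous feature. This is only an illustration of turning continuous values into discrete interval labels; it is not the rough-set and genetic-algorithm method of the citing paper, and the function name `equal_width_bins` is hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to an integer bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature collapses into a single interval.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


readings = [0.0, 2.5, 5.0, 7.5, 10.0]
print(equal_width_bins(readings, 4))  # [0, 1, 2, 3, 3]
```

Fewer bins give a simpler dataset at the cost of resolution, which is the trade-off that optimized discretization schemes try to balance.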

Reinforcement Learning-Based Genetic Algorithm in Optimizing Multidimensional Data Discretization Scheme

Chen

Huang

et al. 2020

Mathematical Problems in Engineering

Feature discretization can reduce the complexity of data and improve the efficiency of data mining and machine learning. However, in the process of multidimensional data discretization, limited by the complex correlation among features and the performance bottleneck of traditional discretization criteria, the schemes obtained by most algorithms are not optimal in specific application scenarios and can even fail to meet the accuracy requirements of the system. Although some swarm intelligence algorithms can achieve better results, it is difficult to formulate appropriate strategies without prior knowledge, which will make the search in multidimensional space inefficient, consume many computing resources, and easily fall into local optima. To solve these problems, this paper proposes a genetic algorithm based on reinforcement learning to optimize the discretization scheme of multidimensional data. We use rough sets to construct the individual fitness function, and we design the control function to dynamically adjust population diversity. In addition, we introduce a reinforcement learning mechanism to crossover and mutation to determine the crossover fragments and mutation points of the discretization scheme to be optimized. We conduct simulation experiments on Landsat 8 and Gaofen-2 images, and we compare our method to the traditional genetic algorithm and state-of-the-art discretization methods. Experimental results show that the proposed optimization method can further reduce the number of intervals and simplify the multidimensional dataset without decreasing the data consistency and classification accuracy of discretization.
