This paper introduces a framework for appropriately adopting and adjusting machine learning (ML) techniques used to construct electrocardiogram (ECG)-based biometric authentication schemes. The proposed framework can help investigators and developers of ECG-based biometric authentication mechanisms define the boundaries of the required datasets and obtain good-quality training data. To determine the dataset boundaries, use case analysis is adopted. Based on various application scenarios for ECG-based authentication, three distinct use cases (or authentication categories) are developed. With better-qualified training data supplied to the corresponding ML schemes, the precision of ML-based ECG biometric authentication mechanisms increases as a consequence. The ECG time slicing technique with R-peak anchoring is used in this framework to acquire good-quality ML training data. The proposed framework also introduces four new metrics for evaluating the quality of the ML training and testing data. In addition, a Matlab toolbox containing all proposed mechanisms, metrics, and sample data, with demonstrations using various ML techniques, has been developed and made publicly available for further investigation. For developing ML-based ECG biometric authentication, the proposed framework can guide researchers in preparing proper ML setups and ML training datasets for the three identified use case scenarios. For researchers adopting ML techniques to design new schemes in other research domains, the proposed framework remains useful for generating good-quality ML training and testing datasets and for applying the new metrics.
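As a rough illustration of the time slicing with R-peak anchoring mentioned above, the following Python sketch cuts one fixed-length window around each R-peak so that every training sample is aligned on the same point of the heartbeat. It is not the toolbox's implementation: the function name `slice_beats`, the window parameters `pre` and `post`, and the toy signal are all hypothetical, and the R-peak positions are assumed to have been detected beforehand.

```python
from typing import List, Sequence


def slice_beats(signal: Sequence[float], r_peaks: Sequence[int],
                pre: int, post: int) -> List[List[float]]:
    """Cut one window per R-peak: `pre` samples before the peak and
    `post` samples from the peak onward (hypothetical parameters)."""
    segments = []
    for r in r_peaks:
        start, stop = r - pre, r + post
        # Skip beats whose window would run past either end of the recording.
        if start < 0 or stop > len(signal):
            continue
        segments.append(list(signal[start:stop]))
    return segments


# Toy signal (not real ECG data) with "R-peaks" of value 9.0 at indices 3 and 9.
signal = [0.0, 0.1, 0.2, 9.0, 0.3, 0.1, 0.0, 0.1, 0.2, 9.0, 0.3, 0.1]
segments = slice_beats(signal, r_peaks=[3, 9], pre=2, post=3)
```

Each resulting segment has length `pre + post`, with the R-peak always at index `pre`, which is what makes the slices directly comparable as ML inputs.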
An ever-increasing number of computing devices interconnected through wireless networks within cyber-physical-social systems, together with the significant amount of sensitive network data transmitted among them, has raised security and privacy concerns. The intrusion detection system (IDS) is known to be an effective defence mechanism, and machine learning (ML) methods have recently been used in its development. However, Internet of Things (IoT) devices often have limited computational resources, such as a limited energy source, computational power, and memory; traditional ML-based IDSs that require extensive computational resources are therefore unsuitable for running on such devices. This study aims to design and develop a lightweight ML-based IDS tailored to resource-constrained devices. Specifically, the study proposes a lightweight ML-based IDS model named IMPACT (IMPersonation Attack deteCTion using deep auto-encoder and feature abstraction). The model is based on deep feature learning with a gradient-based linear Support Vector Machine (SVM), and it is made deployable on resource-constrained devices by reducing the number of features through feature extraction and selection using a stacked autoencoder (SAE), mutual information (MI), and a C4.8 wrapper. IMPACT is trained on the Aegean Wi-Fi Intrusion Dataset (AWID) to detect impersonation attacks. Numerical results show that the proposed IMPACT achieved 98.22% accuracy with a 97.64% detection rate and a 1.20% false alarm rate, and that it outperformed existing state-of-the-art benchmark models. Another key contribution of this study is an investigation of the features in the AWID dataset and their usability for further IDS development.
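For readers unfamiliar with the three reported figures, the sketch below shows how accuracy, detection rate, and false alarm rate are commonly derived from a binary confusion matrix. It assumes the conventional definitions (detection rate as the true positive rate, false alarm rate as the false positive rate), which the abstract does not spell out. The function name `ids_metrics` and the counts are hypothetical and are not taken from the AWID results.

```python
def ids_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute common IDS metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal traffic passed as normal fp: normal traffic flagged as attack
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),    # share of attacks that are caught
        "false_alarm_rate": fp / (fp + tn),  # share of normal traffic flagged
    }


# Hypothetical counts for illustration only (not results from the paper).
m = ids_metrics(tp=90, tn=880, fp=20, fn=10)
```

Because attack traffic is typically much rarer than normal traffic, accuracy alone can look high even when many attacks are missed, which is why the detection rate and false alarm rate are reported alongside it.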