Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between observations and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples become similar, so each sample may appear to be an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k-nearest neighbor graph (K-NNG)-based anomaly detector. Benefiting from its capacity for nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset and represent the data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made jointly by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity.
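The ensemble stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the autoencoder has already mapped the data into a low-dimensional subspace (simulated here with 2-D Gaussian points), it uses the distance to the k-th nearest neighbor as a simple stand-in for the K-NNG-based score, and it assumes the detectors are combined by averaging, which the abstract does not specify. The names `build_ensemble` and `anomaly_score` are hypothetical.

```python
import math
import random


def kth_neighbor_distance(point, reference, k):
    """Distance from `point` to its k-th nearest neighbor in `reference`."""
    distances = sorted(math.dist(point, ref) for ref in reference)
    return distances[k - 1]


def build_ensemble(nominal, n_detectors, subset_size, seed=0):
    """Each detector is a random subset of the nominal (training) points."""
    rng = random.Random(seed)
    return [rng.sample(nominal, subset_size) for _ in range(n_detectors)]


def anomaly_score(point, ensemble, k):
    """Average the k-th neighbor distance over all detectors (assumed rule)."""
    scores = [kth_neighbor_distance(point, subset, k) for subset in ensemble]
    return sum(scores) / len(scores)


# Stand-in for encoded nominal data: 200 points drawn around the origin.
rng = random.Random(42)
nominal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]

ensemble = build_ensemble(nominal, n_detectors=5, subset_size=50)
inlier_score = anomaly_score((0.0, 0.0), ensemble, k=3)
outlier_score = anomaly_score((8.0, 8.0), ensemble, k=3)
```

Because each detector only searches its own subset, the cost of a query grows with the subset size rather than with the full dataset, which is one plausible reading of the complexity reduction the abstract reports.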
Microwave-based breast cancer detection has been proposed as a complementary approach to compensate for some drawbacks of existing breast cancer detection techniques. Among the existing microwave breast cancer detection methods, machine learning-type algorithms have recently become more popular. These focus on detecting the existence of breast tumours rather than performing imaging to identify the exact tumour position. A key component of the machine learning approaches is feature extraction. One of the most widely used feature extraction methods is principal component analysis (PCA). However, it can be sensitive to signal misalignment. This paper proposes feature extraction methods based on time-frequency representations of microwave data, including the wavelet transform and the empirical mode decomposition (EMD). Time-invariant statistics can be generated to provide features more robust to data misalignment. We validate results using clinical data sets combined with numerically simulated tumour responses. Experimental results show that features extracted from decomposition results of the wavelet transform and EMD improve the detection performance when combined with an ensemble selection-based classifier.
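The idea of time-invariant statistics computed from a decomposition can be sketched as follows. This is an illustration under stated assumptions, not the paper's feature set: it uses a single level of the Haar wavelet, a synthetic damped sinusoid in place of microwave data, and mean, variance and energy as the summary statistics. The function names are hypothetical.

```python
import math


def haar_detail(signal):
    """One level of Haar detail coefficients (signal length must be even)."""
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def summary_features(coeffs):
    """Statistics that ignore where in time each coefficient occurs."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n
    energy = sum(c * c for c in coeffs)
    return (mean, variance, energy)


# Synthetic stand-in for a measured signal: a damped sinusoid.
signal = [math.sin(0.3 * t) * math.exp(-0.02 * t) for t in range(64)]

# Misaligned copy: circular shift by 4 samples.
shifted = signal[-4:] + signal[:-4]

features = summary_features(haar_detail(signal))
features_shifted = summary_features(haar_detail(shifted))
```

For an even circular shift, the detail coefficients are only reordered, so the two feature tuples agree up to floating-point rounding, whereas a sample-by-sample comparison of the raw signals would differ. For odd shifts the decimated Haar transform is not shift-equivariant, so the statistics would be preserved only approximately.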
Microwave-based breast cancer detection has been proposed as a complementary approach to compensate for some drawbacks of existing breast cancer detection techniques. Among the existing microwave breast cancer detection methods, machine learning-type algorithms have recently become more popular. These focus on detecting the existence of breast tumours rather than performing imaging to identify the exact tumour position. A key step of the machine learning approaches is feature extraction. One of the most widely used feature extraction methods is principal component analysis (PCA). However, it can be sensitive to signal misalignment. This paper presents an empirical mode decomposition (EMD)-based feature extraction method, which is more robust to such misalignment. Experimental results involving clinical data sets combined with numerically simulated tumour responses show that combined features from EMD and PCA improve the detection performance with an ensemble selection-based classifier.