Motivation
Deep neural network (DNN) algorithms have recently been used to predict various biomedical phenotypes and have demonstrated very good prediction performance without feature selection. This study hypothesized that DNN models may be further improved by feature selection algorithms.
Results
A comprehensive comparative study was carried out by evaluating 11 feature selection algorithms on three conventional DNN algorithms, i.e. convolutional neural network (CNN), deep belief network (DBN) and recurrent neural network (RNN), and three recent DNNs, i.e. MobilenetV2, ShufflenetV2 and Squeezenet. Five binary classification methylomic datasets were chosen to calculate the prediction performances of CNN/DBN/RNN models using features selected by the 11 feature selection algorithms. Seventeen binary classification transcriptome datasets and two multi-class transcriptome datasets were also utilized to evaluate how the hypothesis may generalize to different data types. The experimental data supported our hypothesis that feature selection algorithms may improve DNN models, and the DBN models using features selected by SVM-RFE usually achieved the best prediction accuracies on the five methylomic datasets.
Availability and implementation
All the algorithms were implemented and tested under the programming environment Python version 3.6.6.
Supplementary information
Supplementary data are available at Bioinformatics online.
The neurological disorder epilepsy causes substantial problems for patients, including uncontrolled seizures and even sudden death. Accurate detection and prediction of epileptic seizures will significantly improve the quality of life of epileptic patients. Various feature extraction algorithms have been proposed to describe EEG signals in the frequency or time domain. Both invasive intracranial and non-invasive scalp EEG signals have been screened for epileptic seizure patterns. This study extracted a comprehensive list of 24 feature types from scalp EEG signals and selected 170 of the 2794 features for accurate classification of epileptic seizures. The optimized model achieved an accuracy (Acc) of 99.40% for detecting epileptic seizures from scalp EEG signals. The balanced accuracy (bAcc) was calculated as the average of sensitivity and specificity, and the seizure detection model achieved bAcc = 99.61%. The same experimental procedure was applied to predict epileptic seizures in advance, and the model achieved Acc = 99.17% for predicting epileptic seizures 10 s before onset.
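The balanced accuracy defined above (the mean of sensitivity and specificity) can be sketched in a few lines of Python. This is only an illustration of the metric, not the study's code; the function name `balanced_accuracy` and the label encoding (1 for seizure, 0 for non-seizure) are assumptions made for the example.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Return (sensitivity + specificity) / 2 for binary labels.

    Hypothetical helper: the label encoding (positive=1) is an assumption.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample wrongly flagged
            else:
                tn += 1  # negative sample correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2
```

Unlike plain accuracy, this value does not rise simply because one class dominates the dataset, which matters when seizure segments are much rarer than non-seizure segments.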
A comparative analysis on related transcriptomic datasets also indicated that the proposed biomarkers provide discriminative power for accurate stage prediction, and that this performance is expected to improve when larger-scale quantitative proteomic technologies become available.
Aim: Breast cancer histologic grade (HG) is a well-established prognostic factor. This study aimed to select methylomic biomarkers to predict breast cancer HGs. Materials & methods: The proposed algorithm BioDog first used a correlation bias reduction strategy to eliminate redundant features. Incremental feature selection was then applied to find the features with a high HG prediction accuracy. A sequential backward feature elimination strategy was employed to further refine the biomarkers. A comparison with existing algorithms was conducted. The HG-specific somatic mutations were investigated. Results & conclusions: BioDog achieved an accuracy of 0.9973 using 92 methylomic biomarkers for predicting breast cancer HGs. Many of these biomarkers were within the genes and lncRNAs associated with HG development in breast cancer or other cancer types.
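The incremental feature selection and sequential backward elimination steps named above can be sketched generically as follows. This is a minimal illustration of the two strategies, not BioDog's actual implementation: the function names, the pre-ranked feature list, and the `score` callback (standing in for a cross-validated prediction accuracy) are all assumptions, and the correlation bias reduction step is omitted.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: maps a feature subset to a quality score (higher is better).
Scorer = Callable[[List[str]], float]


def incremental_feature_selection(
    ranked: Sequence[str], score: Scorer
) -> Tuple[List[str], float]:
    """Evaluate the top-k ranked features for k = 1..n and keep the best prefix."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = list(ranked[:k])
        current = score(subset)
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


def backward_elimination(
    features: Sequence[str], score: Scorer
) -> Tuple[List[str], float]:
    """Repeatedly drop a feature whose removal does not lower the score."""
    current = list(features)
    current_score = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for feature in list(current):
            candidate = [f for f in current if f != feature]
            candidate_score = score(candidate)
            if candidate_score >= current_score:
                # Accept the smaller subset and restart the scan.
                current, current_score = candidate, candidate_score
                improved = True
                break
    return current, current_score
```

In practice `score` would wrap a classifier evaluated by cross-validation, so each call is expensive; the prefix search keeps the number of evaluations linear in the number of ranked features before the finer backward pass begins.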