This article addresses the application of machine learning methods to the automated detection of sleep apnea episodes from single-channel electrocardiogram (ECG) and electroencephalogram (EEG) signals. To study how well machine learning can detect apnea from ECG and EEG analysis, we used the Apnea-ECG database and the MIT-BIH Polysomnographic Database from PhysioNet, whose records carry annotations for each minute indicating the presence or absence of apnea/hypopnea at that time. To apply machine learning methods to the automated detection of sleep apnea/hypopnea episodes in ECG and EEG signals, the long-term polysomnograms in the MIT-BIH Polysomnographic Database were segmented according to the annotations into shorter sections of 30 seconds each. The study used 267 segments for the class "norm", 258 segments for the class "apnea" and 273 segments for the class "hypopnea", a total of 798 simultaneous 30-second ECG and EEG recordings.
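The segmentation step can be illustrated with a minimal sketch. The function name `segment_by_annotation`, the `(start_second, label)` annotation layout, the 250 Hz sampling rate and the synthetic signal are all assumptions made for illustration; they are not taken from the study, and reading actual PhysioNet records is left out.

```python
from collections import defaultdict

def segment_by_annotation(signal, fs, annotations, seg_seconds=30):
    """Cut a long recording into fixed-length segments grouped by class label.

    signal: sequence of samples; fs: sampling rate in Hz (assumed value below);
    annotations: list of (start_second, label) pairs (hypothetical layout).
    Segments that would run past the end of the recording are skipped.
    """
    seg_len = int(seg_seconds * fs)
    segments = defaultdict(list)
    for start_s, label in annotations:
        start = int(start_s * fs)
        stop = start + seg_len
        if stop <= len(signal):
            segments[label].append(signal[start:stop])
    return dict(segments)

# Toy example: 120 s of a synthetic signal, assumed sampling rate of 250 Hz
fs = 250
signal = list(range(120 * fs))
annotations = [(0, "norm"), (30, "apnea"), (60, "hypopnea"),
               (90, "norm"), (110, "apnea")]  # last one overruns the record
segments = segment_by_annotation(signal, fs, annotations)
counts = {label: len(segs) for label, segs in segments.items()}
print(counts)
```

In this toy run the annotation starting at 110 s is dropped because a full 30-second window does not fit in the remaining signal.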
The aim of this work is to identify and compare informative features of sleep apnea episodes in heart rate variability (HRV) and brain electrical activity, and to select the classification methods that provide the highest accuracy for this task. We consider features of cardiorhythmograms in the time and frequency domains, their spectral-temporal and wavelet characteristics, and EEG parameters based on the energy ratios of EEG rhythms, the Hurst exponent, the Higuchi fractal dimension and sample entropy. Using different feature sets, we determined the accuracy of classifiers based on decision trees, discriminant analysis, support vector machines, the k-nearest neighbors method and ensemble learning. On this basis, we propose the combination of features and classifiers that provides the highest accuracy in recognizing sleep apnea episodes from single-channel ECG and EEG signals, both taken separately and with their features combined.
The best classification results for the "norm", "apnea" and "hypopnea" signals were obtained with a model trained using the weighted k-nearest neighbors method on 25 HRV features: 99.9% of cases across the three classes were identified correctly (797 of 798). When the number of HRV parameters was reduced to 9, the best result was achieved with a bagging ensemble of 30 decision trees: 99.4% of cases across the three classes were identified correctly (793 of 798: 265 of 267 for "norm", 257 of 258 for "apnea" and 271 of 273 for "hypopnea"). Using EEG parameters as features for apnea/hypopnea recognition gave worse results than using HRV parameters. In this case, the best result was achieved with a support vector machine with a quadratic kernel: 91.9% of cases across the three classes were identified correctly, and the signals corresponding to the norm were recognized worst (27 cases were classified as hypopnea and 9 as sleep apnea). Combining HRV and EEG parameters gave a best accuracy of 99.1%, which is comparable to the results obtained with HRV parameters alone. These results indicate that HRV parameters allow sleep apnea and hypopnea to be recognized with higher accuracy than EEG parameters; nevertheless, the EEG signal also reflects signs of sleep apnea/hypopnea and can be used for apnea recognition.
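The overall figure for the 9-feature bagging model follows directly from the per-class counts given above. The short check below recomputes it; the variable names are illustrative, and only the counts themselves come from the text.

```python
# Per-class counts reported for the 9-feature bagging model: (correct, total)
reported = {"norm": (265, 267), "apnea": (257, 258), "hypopnea": (271, 273)}

correct = sum(c for c, _ in reported.values())
total = sum(t for _, t in reported.values())

# Overall percentage of correctly identified cases across the three classes
overall = 100.0 * correct / total
# Per-class percentage of correctly identified cases
per_class = {label: 100.0 * c / t for label, (c, t) in reported.items()}

print(f"overall: {correct}/{total} = {overall:.1f}%")
for label, pct in per_class.items():
    print(f"{label}: {pct:.1f}%")
```

The class totals also sum to the 798 segments of the data set, so the per-class breakdown is consistent with the stated 793 of 798.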