2021
DOI: 10.1109/jbhi.2021.3064237

CNN-MoE Based Framework for Classification of Respiratory Anomalies and Lung Disease Detection

Abstract: This paper presents and explores a robust deep learning framework for auscultation analysis, which aims to classify anomalies in respiratory cycles and detect diseases from respiratory sound recordings. The framework begins with front-end feature extraction that transforms the input sound into a spectrogram representation. A back-end deep learning network then classifies the spectrogram features into categories of respiratory anomaly cycles or diseases. Experiments, conducted over the ICBHI benchmark dat…
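The abstract describes a two-stage pipeline: a front end that converts raw audio into a spectrogram, followed by a CNN back end that classifies it. A minimal sketch of the spectrogram front end is below; the window size, hop length, and sampling rate are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of the front-end feature-extraction stage: raw audio -> spectrogram.
# The n_fft/hop/sr values here are illustrative assumptions, not the paper's settings.
import numpy as np

def stft_spectrogram(signal, n_fft=512, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    # Shape: (freq_bins, time_frames) -- the 2-D "image" a CNN back end consumes.
    return np.stack(frames, axis=1)

# Example: 1 second of a 440 Hz tone sampled at 4 kHz.
sr = 4000
t = np.arange(sr) / sr
spec = stft_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The spectral peak of the tone lands near bin 440 * n_fft / sr ≈ 56, which is how one can sanity-check the transform before feeding real lung-sound recordings to a classifier.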

Cited by 101 publications (58 citation statements)
References 36 publications
“…The accomplishments of the present study are the following: (i) the hybrid CNN-LSTM approach provides the best combination of performance (sensitivity, specificity, score, accuracy) in comparison with all previous relevant studies (including Pham et al [ 30 ] as results using random 5-fold CV are not reliable), (ii) the proposed model performs well for a highly imbalanced dataset, (iii) the FL function delivers better results than the classic CE function for ECG classification, and (iv) the proposed method could be used for real-time lung sound classification as the prediction phase lasts only a few seconds.…”
Section: Results (mentioning)
confidence: 86%
“…At the same time, Ma et al [ 28 ] introduced a non-local (NL) block into a ResNet and used STFT features for lung sound classification. Yang et al [ 29 ] analyzed STFT features with a ResNet with squeeze and excitation (SE) and spatial attention (SA) blocks for the identification of abnormal lung sounds, while another study by Pham et al [ 30 ] implemented a mixture-of-experts (MoE) block into a CNN structure and used mel spectrogram, gammatone-based spectrogram, MFCC and rectangular constant Q transform (CQT) features for the same purpose. Lastly, Nguyen and Pernkopf [ 31 ] implemented a ResNet to process mel spectrograms and classify respiratory sounds into four different categories.…”
Section: Related Work (mentioning)
confidence: 99%
“…
System                  Test Cond.              Specificity(%)  Sensitivity(%)  Score(%)
GMM-HMM [12]            original split (60/40)  --              --              39.56
Decision Tree [13]      original split (60/40)  75              12              43
CNN-MoE [19]            original split (60/40)  68              26              47
VGG-16 (two path) [16]  …

…backbone but were beyond the SE-ResNet. For specificity values, the reverse applied.…”
Section: System (mentioning)
confidence: 99%
“…Furthermore, these frameworks should be compatible with real-time portable or wearable computational devices. This contribution is published in the 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society [32], is being considered for publication in the IEEE Journal of Biomedical and Health Informatics [33], and appears in the 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society [34].…”
mentioning
confidence: 99%
“…With this knowledge distillation in place, training the student network aims to minimize two losses: (1) the Euclidean distance loss $\mathrm{LOSS}_{EU}$ between the teacher and student embeddings, and (2) the standard cross-entropy loss $\mathrm{LOSS}_{EN}$ on the student's classification output. The combined loss function is therefore

$$\mathrm{LOSS} = (1 - \gamma)\,\mathrm{LOSS}_{EN} + \gamma\,\mathrm{LOSS}_{EU} \tag{33}$$
…”
mentioning
confidence: 99%
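The distillation loss quoted above interpolates between a cross-entropy term on the student's predictions and a Euclidean term pulling the student embedding toward the teacher's. A minimal sketch, assuming toy numpy vectors for the embeddings and softmax output (the tensor shapes and the value of gamma are illustrative, not taken from the cited work):

```python
# Sketch of the combined distillation loss quoted above (Eq. 33):
#   LOSS = (1 - gamma) * LOSS_EN + gamma * LOSS_EU
# Shapes and gamma are illustrative assumptions.
import numpy as np

def cross_entropy(probs, label):
    """Standard cross-entropy on the student's softmax output (LOSS_EN)."""
    return -np.log(probs[label])

def euclidean_loss(teacher_emb, student_emb):
    """Euclidean distance between teacher and student embeddings (LOSS_EU)."""
    return np.linalg.norm(teacher_emb - student_emb)

def combined_loss(probs, label, teacher_emb, student_emb, gamma=0.5):
    """Interpolate the two terms: gamma=0 is pure cross-entropy, gamma=1 pure distance."""
    loss_en = cross_entropy(probs, label)
    loss_eu = euclidean_loss(teacher_emb, student_emb)
    return (1 - gamma) * loss_en + gamma * loss_eu
```

Setting gamma to 0 recovers ordinary supervised training of the student; gamma controls how strongly the student's embedding is regularized toward the teacher's.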