Objective: Over the last few decades, there has been significant interest in the automatic analysis of respiratory sounds. However, currently there are no publicly available large databases with which new algorithms can be evaluated and compared. Further developments in the field are dependent on the creation of such databases. Approach: This paper describes a public respiratory sound database, which was compiled for an international competition, the first scientific challenge of the IFMBE’s International Conference on Biomedical and Health Informatics. The database includes 920 recordings acquired from 126 participants and two sets of annotations. One set contains 6898 annotated respiratory cycles, some including crackles, wheezes, or a combination of both, and some with no adventitious respiratory sounds. In the other set, precise locations of 10 775 events of crackles and wheezes were annotated. Main results: The best system that participated in the challenge achieved an average score of 52.5% with the respiratory cycle annotations and an average score of 91.2% with the event annotations. Significance: The creation and public release of this database will be useful to the research community and could bring attention to the respiratory sound classification problem.
Recently proposed Parkinson's Disease (PD) telediagnosis systems based on detecting dysphonia achieve very high classification rates in discriminating healthy subjects from PD patients. However, in these studies the data used to construct the classification model contain speech recordings of both early and late PD patients with differing severities of speech impairment, which leads to unrealistic results. In a more realistic scenario, an early telediagnosis system is expected to be used in suspicious cases by healthy subjects or early PD patients with mild speech impairment. In this paper, considering the critical importance of early diagnosis in the treatment of the disease, we evaluate the ability of vocal features to support early telediagnosis of PD using machine learning techniques with a two-step approach. In the first step, using only patient data, we aim to determine the patient group with relatively greater severity of speech impairment, using the Unified Parkinson's Disease Rating Scale (UPDRS) score as an index of disease progression. For this purpose, we use three supervised and two unsupervised learning techniques. In the second step, we exclude the samples of this group of patients from the dataset, create a new dataset consisting of the samples of PD patients with less severe speech impairment and of healthy subjects, and use three classifiers with various settings to address this binary classification problem. In this classification problem, the highest accuracy of 96.4% and a Matthews correlation coefficient of 0.77 are obtained using support vector machines with a third-degree polynomial kernel, showing that vocal features can be used to build a decision support system for early telediagnosis of PD.
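The abstract reports the Matthews correlation coefficient (MCC) alongside accuracy. A minimal sketch of how the two metrics are computed from a binary confusion matrix is given below. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch shows that accuracy and MCC can diverge when the classes are imbalanced, which is one common reason for reporting both.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    By convention, 0.0 is returned when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, imbalanced counts (NOT the paper's data):
# 91 positive samples and 9 negative samples.
tp, tn, fp, fn = 90, 5, 4, 1

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

With counts like these, accuracy is dominated by the majority class, whereas MCC also penalizes errors on the minority class, so it is noticeably lower than the accuracy figure.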