The ability to accurately stratify patients at risk of adverse cardiovascular outcomes using heart sound recordings could result in earlier treatment and improved patient outcomes. However, several challenges remain in risk-stratifying patients based on the phonocardiogram (PCG) alone. First, inter-patient differences can make it difficult to learn a model that generalizes well across patients. Second, heterogeneity introduced by the collection environment of the recordings can render a classifier trained on one population useless when applied to another. To address these challenges, we explore the use of temporal alignment techniques, in particular dynamic time warping (DTW). Using DTW, we compare heart sounds within and across subjects and recordings. These DTW-based features, coupled with widely used mel-frequency cepstral coefficients (MFCCs), serve as input to a linear SVM. Applied to the held-out test set, our classifier obtained a score of 82.4%, suggesting that temporal alignment techniques can effectively reduce the effects of inter-patient variability and mitigate the differences introduced by heterogeneous data collection environments.
Introduction

In cardiac auscultation, an examiner uses a stethoscope to listen for distinct sounds that provide important information about the condition of the heart. Modern recording equipment captures these heart sounds as a phonocardiogram (PCG). In principle, these recordings could be used to automatically monitor patients and diagnose cardiac abnormalities. Yet, while auscultation is a common practice in patient exams, PCGs are not widely used clinically, where echocardiograms and electrocardiograms are more prevalent. This is due, in part, to the lack of robust algorithms for automatically classifying PCGs. To address this issue, the 2016 PhysioNet/CinC Challenge focused on the development of algorithms to classify PCGs collected from both clinical and nonclinical environments [1].

Robust PCG classification algorithms must accurately identify cardiac abnormalities across patients and across diverse recording environments. To address challenges associated with inter-patient variability, we borrow techniques that have been successfully applied in speech processing and ECG analysis, where similar issues arise [2][3][4]. In particular, we explore the use of dynamic time warping (DTW) to measure similarity between heartbeats, both within the same subject and across subjects. Our experiments show that such DTW-based features can mitigate the differences introduced by heterogeneous data collection environments and improve classification performance, especially when training and test populations differ.
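To make the idea of DTW-based similarity concrete, the following is a minimal sketch of the classic DTW recurrence for two one-dimensional sequences. It is an illustration only, not the implementation used in this work: the function name `dtw_distance`, the absolute-difference local cost, and the absence of any warping-window constraint are all assumptions made for the example.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (hypothetical helper).

    Assumes an absolute-difference local cost and no warping-window
    constraint; a practical system might choose either differently.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # cost[i, j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = local + min(
                cost[i - 1, j],      # a advances, b repeats its sample
                cost[i, j - 1],      # b advances, a repeats its sample
                cost[i - 1, j - 1],  # both advance together
            )
    return cost[n, m]


if __name__ == "__main__":
    beat = [0.0, 1.0, 0.0, -1.0, 0.0]
    # The same toy "beat" with each sample held twice as long.
    stretched = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, -1.0, -1.0, 0.0, 0.0]
    print(dtw_distance(beat, stretched))
```

Because DTW allows one sample to be matched against several samples of the other sequence, a time-stretched copy of a signal aligns with the original at zero cost, whereas a sample-by-sample comparison would not even be defined for sequences of different lengths. This is the property that makes the measure attractive for comparing heartbeats whose durations vary within and across subjects.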
Methods

In this section we present our supervised learning system for classifying PCGs as either normal or abnormal. We begin by describing the signal segmentation, then move on to feature extraction, and lastly explain the learning algorithm.
Segmentation

As a first step, we segment the PCG recording into the fundamental heart sounds: S1 and S2 in addition to the s...