A computer-aided MFCC-based HMM system for automatic auscultation

Chauhan, Sunita; Wang, Ping; Lim, Chu Sing; Anantharaman, V.

doi:10.1016/j.compbiomed.2007.10.006

Cited by 110 publications

(63 citation statements)

References 5 publications


“…Mel-Frequency Cepstral Coefficients (MFCC) are classical acoustic speech features used in automatic speech processing [16]. They are state-of-the-art features in many applications, including automatic speech recognition and speaker verification systems.…”

Section: Joint Feature Extraction (mentioning)

confidence: 99%

“…They are state-of-the-art features in many applications, including automatic speech recognition and speaker verification systems. For obtaining an MFCC feature vector, the voice signal is transformed into the frequency domain via windowed Fast Fourier Transform and then mapped onto the Mel scale, a human perceptual scale of frequency [16]. A (logarithmically spaced) filter bank is constructed over this Mel frequency spectrum, and from this the logarithm of the power spectrum is determined.…”

Section: Joint Feature Extraction (mentioning)

confidence: 99%
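
The steps named in the statement above (windowed transform, Mel mapping, filter bank, logarithm) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation. All function names are hypothetical, and the Hamming window, 20 filters, 12 coefficients, and the common 2595·log10(1 + f/700) Mel formula are assumptions. A naive DFT stands in for the FFT so that the sketch needs only the standard library, and the closing cosine transform is the conventional last MFCC step, which the quoted text does not reach.

```python
import cmath
import math


def hz_to_mel(f):
    # One widely used Mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then return its one-sided power spectrum."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):  # naive DFT, standing in for an FFT
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=12):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the small floor avoids log(0).
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-12))
                    for row in bank]
    return dct2(log_energies, n_coeffs)


# Example: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)]
coefficients = mfcc_frame(frame, 8000)
```

In practice a library routine with a real FFT would replace `power_spectrum`, and the signal would be cut into overlapping frames with one coefficient vector computed per frame.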

Biometric liveness checking using multimodal fuzzy fusion

Chetty

2010

International Conference on Fuzzy Systems

Abstract: In this paper we propose a novel fusion protocol based on fuzzy fusion of face and voice features for checking liveness in secure identity authentication systems based on face and voice biometrics. Liveness checking can detect fraudulent impostor attacks on the security systems and ensure that biometric cues are acquired from a live person who is actually present at the time of capture for authenticating the identity. The proposed fuzzy fusion of audio-visual features is based on mutual dependency models which extract the spatio-temporal correlation between face and voice dynamics during speech production. Performance evaluation in terms of DET (Detector Error Tradeoff) curves and EERs (Equal Error Rates) on publicly available audiovisual speech databases shows a significant improvement in performance of the proposed fuzzy fusion of face-voice features based on mutual dependency models over conventional fusion techniques.

“…MFCCs offer several benefits over wavelets such as decorrelated coefficients, which often perform better in linear models. Moreover, researchers have had success using MFCCs with HMMs for automatic auscultation classification [7].…”

Section: Spectral Analysis: MFCC Features (mentioning)

confidence: 99%

Heart Sound Classification Based on Temporal Alignment Techniques

Ortiz

Phoo

Wiens

2016

2016 Computing in Cardiology Conference (CinC)

The ability to accurately stratify patients at risk of adverse cardiovascular outcomes using heart sound recordings could result in earlier treatment and improved patient outcomes. However, there remain several challenges associated with risk stratifying patients based on the phonocardiogram (PCG) alone. First, inter-patient differences can make it challenging to learn a model that generalizes well across patients. Second, heterogeneity introduced by the collection environment of the recordings can render a classifier trained on one population useless when applied to another. To address these challenges we explore the use of temporal alignment techniques, in particular dynamic time warping (DTW). Using DTW we compare heart sounds within and across subjects/recordings. These DTW-based features, coupled with widely used spectral MFCC coefficients, serve as input to a linear SVM. Applied to the held-out test set our classifier obtained a test score of 82.4%, suggesting that temporal alignment techniques can effectively reduce the effects of inter-patient variability and mitigate the differences introduced by heterogeneous data collection environments.

Introduction

In cardiac auscultation an examiner uses a stethoscope to listen for unique and distinct sounds that provide important data regarding the condition of the heart. Modern recording equipment captures these heart sounds as a phonocardiogram (PCG). In principle, these recordings could be used to automatically monitor patients and diagnose cardiac abnormalities. Yet, while auscultation is a common practice in patient exams, PCGs are not widely used clinically, where echocardiograms and electrocardiograms are more prevalent. This is due, in part, to the lack of robust algorithms for automatically classifying PCGs. To address this issue, the 2016 PhysioNet/CinC Challenge focused on the development of algorithms to classify PCGs collected from both clinical and nonclinical environments [1].

Robust PCG classification algorithms must accurately identify cardiac abnormalities across patients and across diverse recording environments. To address challenges associated with inter-patient variability we borrow techniques that have been successfully applied in speech processing and ECG analysis, where similar issues arise [2][3][4]. In particular, we explore the use of dynamic time warping (DTW) in measuring similarity between heartbeats from the same subject and across subjects. Our experiments show that such DTW-based features can mitigate the differences introduced by heterogeneous data collection environments and improve classification performance, especially when training and test populations differ.

Methods

In this section we present our supervised learning system for classifying PCGs as either normal or abnormal. We begin by describing the signal segmentation, then move on to feature extraction and lastly explain the learning algorithm.

Segmentation

As a first step, we segment the PCG recording into the fundamental heart sounds: S1 and S2 in addition to the s...
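
The dynamic time warping comparison mentioned in the abstract above can be illustrated with the textbook recurrence. This is a minimal sketch, not the cited authors' feature pipeline. The function name `dtw_distance`, the absolute-difference local cost, and the absence of a warping-window constraint are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (no window constraint)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b[j-1] reused
                                     cost[i][j - 1],      # b advances, a[i-1] reused
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Two "beats" with the same shape but different durations align at zero cost.
beat_a = [0.0, 1.0, 2.0, 1.0, 0.0]
beat_b = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
distance = dtw_distance(beat_a, beat_b)
```

Because the alignment may stretch or compress either sequence, beats of different duration but similar shape yield a small distance, which is the property that makes DTW useful when heart rates differ across recordings.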

“…Given f (Hz) frequency in the following equation should be used to express the frequency scale [14];…”

Section: Feature Extraction (mentioning)

confidence: 99%

“…In this stage, the speech command signals are prepared for smooth playback, followed by high-pass filtration and segmentation. During signal preprocessing, the following steps were included [14]:…”

Section: Signal Pre-processing (mentioning)

confidence: 99%

Real-Time Control of Mobile Robot Using HMM-based Speech Recognition System

Toylan

Türkeş

Çağlarer

2017

ANADOLU UNIVERSITY JOURNAL OF SCIENCE AND TECHNOLOGY a - Applied Sciences and Engineering

Human-robot interaction (HRI) is a significant area of interest in robotics that has attracted a wide variety of studies in recent years. To provide natural human-robot interaction, robots will have to acquire the skills to detect and meaningfully integrate information from multiple modalities. In this paper, a practical speech-controlled mobile robot car system is presented and discussed. A Hidden Markov Model (HMM) based separate word recognition system was developed and used for real-time control of a mobile robot. Mel-Frequency Cepstral Coefficients (MFCC) were used as features for the control design of the mobile robot. In the study, 270 speech commands (İLERİ=forward, GERİ=backward, DUR=stop, SAĞA=right, SOLA=left), collected from 54 different people, were passed through a series of mathematical operations to derive 12 cepstral coefficients, from which a database was generated. The HMM was then trained and tested on this database. The speech data were split into two groups: 90% training data and 10% test data. The recognition success rate on the test commands was measured as 94%.
