Dominique Fohr scite author profile

Dominique Fohr

2 Publications

57 Citation Statements Received

74 Citation Statements Given

Affiliations

Lorraine Research Laboratory in Computer Science and its Applications, Université de Lorraine, French Institute for Research in Computer Science and Automation

Publications

A wavelet-based parameterization for speech/music discrimination

Didiot, Illina, Fohr et al. 2010

Computer Speech & Language

Abstract: This paper addresses the problem of parameterization for speech/music discrimination. The current successful parameterization based on cepstral coefficients uses the Fourier transform (FT), which is well adapted to stationary signals. To take into account the non-stationarity of music/speech signals, this work proposes to study wavelet-based signal decomposition instead of the FT. Three wavelet families and several numbers of vanishing moments have been evaluated. Different types of energy, calculated for each frequency band obtained from the wavelet decomposition, are studied. Static, dynamic and long-term parameters were evaluated. The proposed parameterizations are integrated into two class/non-class classifiers: one for speech/non-speech, one for music/non-music. Different experiments on realistic corpora, including different styles of speech and music (Broadcast News, Entertainment, Scheirer), illustrate the performance of the proposed parameterization, especially for music/non-music discrimination. Our parameterization yielded a significant reduction of the error rate. More than 30% relative improvement was obtained for the envisaged tasks compared to MFCC parameterization.
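
As a rough illustration of the idea of computing an energy per wavelet band, the following minimal sketch decomposes one frame with the Haar wavelet and returns a log mean energy for each band. The choice of the Haar wavelet, the log mean energy, the number of levels, and the synthetic frame are all assumptions made for this example; the abstract does not name the wavelet families or energy types the authors evaluated, so this is not their parameterization.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def band_log_energies(frame, levels=3, eps=1e-12):
    """Log mean energy of each detail band, then of the final approximation band.

    The log mean energy is an assumed choice for illustration only.
    """
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(math.log(sum(d * d for d in detail) / len(detail) + eps))
    energies.append(math.log(sum(a * a for a in approx) / len(approx) + eps))
    return energies


# Hypothetical 32-sample frame: a slow sine plus a fast alternating component.
frame = [math.sin(2 * math.pi * n / 16) + 0.5 * (-1) ** n for n in range(32)]
features = band_log_energies(frame, levels=3)
```

With three levels the frame yields four values, one per detail band plus one for the remaining low-frequency band. A classifier would consume such vectors frame by frame.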

Keyword Based Speaker Localization: Localizing a Target Speaker in a Multi-speaker Environment

Sivasankaran, Fohr 2018

To cite this version: Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr. Keyword-based speaker localization: Localizing a target speaker in a multi-speaker environment. Interspeech 2018 - 19th

Abstract: Speaker localization is a hard task, especially in adverse environmental conditions involving reverberation and noise. In this work we introduce the new task of localizing the speaker who uttered a given keyword, e.g., the wake-up word of a distant-microphone voice command system, in the presence of overlapping speech. We employ a convolutional neural network based localization system and investigate multiple identifiers as additional inputs to the system in order to characterize this speaker. We conduct experiments using ground truth identifiers, which are obtained assuming the availability of clean speech, and also in realistic conditions where the identifiers are computed from the corrupted speech. We find that the identifier consisting of the ground truth time-frequency mask corresponding to the target speaker provides the best localization performance, and we propose methods to estimate such a mask in adverse reverberant and noisy conditions using the considered keyword.
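
To make the notion of a ground truth time-frequency mask concrete, here is a minimal sketch of one common definition, a ratio mask computed from the magnitude spectrograms of the clean target and of everything else. The abstract does not say which mask definition the authors use, so the formula, the function name `ratio_mask`, and the toy 2x2 spectrograms are assumptions for illustration only.

```python
def ratio_mask(target_mag, interference_mag, eps=1e-8):
    """Per-bin ratio mask: target magnitude over total magnitude.

    Inputs are lists of rows (time frames) of non-negative magnitudes,
    one value per frequency bin. Values near 1 mark bins dominated by
    the target speaker, values near 0 mark bins dominated by interference.
    """
    return [
        [t / (t + n + eps) for t, n in zip(t_row, n_row)]
        for t_row, n_row in zip(target_mag, interference_mag)
    ]


# Hypothetical magnitudes: 2 time frames x 2 frequency bins.
target = [[1.0, 0.0], [3.0, 2.0]]
interference = [[1.0, 2.0], [1.0, 0.0]]
mask = ratio_mask(target, interference)
```

Computing this mask requires the clean target signal, which is why the abstract distinguishes it from masks estimated from the corrupted speech.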
