Abstract: The paper presents an unsupervised method for word detection in a recorded spoken-language signal. The method examines the signal similarity of two media descriptions: the registered voice and a word (textual query) synthesized with Text-to-Speech tools. Each medium is described by a sequence of Mel-Frequency Cepstral Coefficients or Human-Factor Cepstral Coefficients. The Dynamic Time Warping algorithm is applied to time-align the two descriptions. The …
“…This paper considers the same use case as in [1]. In this approach the KWS method supports human operator in searching for specific words in a given speech medium.…”
Section: A Methods Background (mentioning)
confidence: 99%
“…Using textual query and TTS make it easy to extend the approach to reflect language variations assumed in the scenario, to search for the same word translated to several languages [1].…”
Section: Textual Query (mentioning)
confidence: 99%
“…Optimal means here the lowest cost path P for passing from one point of matrix to another, within given constraints. For details of applying DTW to exemplary speech features vectors, see [1]. …”
Section: Similarity and Time Alignment (mentioning)
confidence: 99%
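The lowest-cost path described in the statement above can be illustrated with a minimal sketch of classic Dynamic Time Warping. It is not the cited paper's implementation: the Euclidean frame distance, the three-way step pattern, and the names `frame_distance` and `dtw` are assumptions chosen for illustration, and no additional path constraints are applied.

```python
import math


def frame_distance(a, b):
    # Assumption: Euclidean distance between two feature vectors
    # (e.g. one frame of cepstral coefficients each).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(query, signal):
    """Return (total_cost, path) for the lowest-cost alignment.

    query, signal: lists of equal-dimension feature vectors.
    path: list of (query_index, signal_index) pairs, start to end.
    """
    n, m = len(query), len(signal)
    inf = float("inf")
    # acc[i][j] = lowest accumulated cost aligning query[:i] with signal[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], signal[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    # Backtrack from the end of both sequences to recover the path P.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (acc[i - 1][j], i - 1, j),          # advance in query only
            (acc[i][j - 1], i, j - 1),          # advance in signal only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path
```

A call such as `dtw(query_frames, signal_frames)` returns the accumulated cost and the alignment path; a low cost suggests the two sequences are similar after time alignment.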
“…Regarding this trend [1] considers an approach that could be used to detect words in recorded speech of unknown language without training, by using publicly available, free of charge online translation services with Text-To-Speech support e.g.: Google Translate, Bing Translator, Yandex Translate …”
Section: Introduction (mentioning)
confidence: 99%
“…The approach presented in [1] employs cepstrum-based features: Mel-Frequency Cepstral Coefficients (MFCC) and Human-Factor Cepstral Coefficients (HFCC). As for classification strategy that approach points Dynamic Time Warping (DTW) algorithm.…”
The paper presents the application of an unsupervised method to word detection in recorded speech for spoken Polish. The method uses a similarity measure between the analyzed speech and a pattern synthesized from pure text. The dynamic time warping algorithm is applied for time alignment, and the resulting alignment path defines the input to the classifier. The classification process involves calculating a cost function and extracting the projected sequence of Human-Factor Cepstral Coefficients, both of which are compared with threshold values. The results obtained on the CLARIN-PL Mobile Corpus encourage further development of this method for the Polish language.
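As a rough illustration of threshold-based detection on an alignment cost, the sketch below slides a synthesized pattern over the analyzed speech and flags windows whose normalized DTW cost falls below a threshold. Everything here is an assumption made for illustration: the fixed-length sliding window, the normalization by combined length, the Euclidean frame distance, and the names `dtw_cost` and `detect`. It covers only the cost criterion, not the projected-sequence criterion the abstract also mentions.

```python
import math


def dtw_cost(a, b):
    """Length-normalized DTW cost between two lists of feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumption: Euclidean distance between frames.
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])))
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Assumption: normalize by the combined length so that costs are
    # comparable across patterns of different duration.
    return acc[n][m] / (n + m)


def detect(pattern, speech, threshold, hop=1):
    """Return start indices of windows whose cost is below the threshold."""
    width = len(pattern)
    hits = []
    for start in range(0, len(speech) - width + 1, hop):
        window = speech[start:start + width]
        if dtw_cost(pattern, window) < threshold:
            hits.append(start)
    return hits
```

In practice the threshold would have to be tuned on labeled data; the value is not given in the text above.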
The paper describes an evaluation of the application of selected similarity functions in the task of keyword spotting. Experiments were carried out in the Polish language. The research results can be used to improve already existing keyword spotting methods, or to develop new ones.