Interspeech 2022
DOI: 10.21437/interspeech.2022-182

Automatic Assessment of Speech Intelligibility using Consonant Similarity for Head and Neck Cancer

Abstract: The automatic prediction of speech intelligibility is a widely known problem in the context of pathological speech. It has been seen as a growing and viable alternative to perceptual evaluation, which is typically time-consuming, highly subjective and strongly biased. For this reason, the development of automatic systems that output not only unbiased predictions but also interpretable scores becomes relevant. In this paper we investigate a method to predict speech intelligibility based on consonant pho…

citations
Cited by 6 publications
(4 citation statements)
references
References 24 publications
“…Other approaches make use of speech processing technologies based on the extraction of relevant features, such as MFCCs, filterbanks, speaker embeddings, etc. [7,8,9,10]. Nevertheless, it is known that these types of data-driven systems tend to require significant amounts of data in order to operate efficiently [11].…”
Section: Introduction (mentioning)
confidence: 99%
“…A variety of approaches and methodologies have been used recently to automatically predict clinical perceptual measures, mainly speech intelligibility. Among these approaches, different schools of thought can be identified, such as regressing scores based on automatic speech recognition performance (Schuster et al., 2006; Christensen et al., 2012; Fontan et al., 2017) and the use of more traditional signal processing techniques and data-driven methodologies (Bin et al., 2019; Quintas et al., 2022, 2023a). The speaker embedding paradigm, where speech utterances are represented as fixed-dimensional vectors that have discriminating properties among different speakers, has shown interesting gains on general pathological speech assessment (Codosero et al., 2019; Zargarbashi and Babaali, 2019) as well as in the specific case of intelligibility prediction (Laaridh et al., 2018; Quintas et al., 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, systems that take perception uncertainty into consideration normally do so in order to improve performance metrics on the gold standard, rather than to identify and assess the individual judge profiles that may emerge. Nevertheless, despite the recent progress in the automatic prediction of speech intelligibility, score interpretability is still a big issue (Quintas et al., 2022), which typically impairs the widespread clinical usage of these technologies.…”
Section: Introduction (mentioning)
confidence: 99%