ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054765

Synthetic Speech References for Automatic Pathological Speech Intelligibility Assessment

Abstract: Automatic pathological speech intelligibility measures are crucial to assist the clinical diagnosis and treatment of speech disorders. The recently proposed pathological short-time objective intelligibility (P-ESTOI) measure was shown to be very advantageous, yielding a high performance for several speech pathologies. However, to assess the intelligibility of an utterance from a patient, P-ESTOI relies on the availability of recordings of the same utterance by several healthy speakers such that an intelligible…

Cited by 1 publication (2 citation statements)
References 21 publications
“…More recently a short-time objective intelligibility (STOI) approach was proposed in [53]. First utterance-dependent reference signal from multiple healthy speakers was constructed using Dynamic Time Warping (DTW).…”
Section: Reference-based Approaches (citation type: mentioning)
confidence: 99%
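
The reference construction mentioned in the statement above (aligning recordings of the same utterance from several healthy speakers with DTW) can be sketched as follows. This is only an illustrative sketch under assumptions, not the cited authors' procedure: the function names `dtw_path` and `build_reference` are hypothetical, the inputs are assumed to be per-speaker feature matrices of shape (frames, bands), the first speaker is arbitrarily used as the time axis, and aligned frames are simply averaged.

```python
import numpy as np


def dtw_path(x, y):
    """Return the DTW alignment path between x (Tx, D) and y (Ty, D)."""
    tx, ty = len(x), len(y)
    # Pairwise Euclidean distances between all frames of x and y.
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    # Accumulated cost with an extra border row/column of infinities.
    cost = np.full((tx + 1, ty + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, tx + 1):
        for j in range(1, ty + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the last frame pair to the first.
    i, j = tx, ty
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def build_reference(sequences):
    """Average several speakers' features on the time axis of the first one.

    Assumption: simple frame averaging after alignment; the cited work may
    combine the aligned signals differently.
    """
    anchor = sequences[0].astype(float)
    total = anchor.copy()
    count = np.ones(len(anchor))
    for seq in sequences[1:]:
        for i, j in dtw_path(anchor, seq):
            total[i] += seq[j]
            count[i] += 1
    return total / count[:, None]
```

With features from several healthy speakers in a list, `build_reference(features)` returns one reference matrix with as many frames as the first speaker's recording.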
“…This method, called P-STOI, was evaluated on French and English speakers. Subsequently in [53], an improvised method was proposed which used synthetic speech generated by a text-to-speech (TTS) systems to create a reference speech signal. Spectral bases of the octave band representations of speech was exploited in [54] by first finding subspaces of spectral patterns characterizing intelligible (healthy) and pathological speech using Principal Component Analysis (PCA) or Approximate Joint Diagonalization (AJD) and then measuring the Grassman distance between the two subspaces.…”
Section: Reference-based Approaches (citation type: mentioning)
confidence: 99%
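
The subspace comparison described in the statement above (PCA subspaces of healthy and pathological spectral patterns, compared by a distance on the Grassmann manifold) can be illustrated with a short sketch. It rests on assumptions that the statement does not confirm: the helper names `pca_basis` and `grassmann_distance` are hypothetical, the inputs are assumed to be (frames, bands) matrices, and the distance used is the geodesic distance computed from principal angles, which is one common definition among several.

```python
import numpy as np


def pca_basis(frames, k):
    """Return a (D, k) orthonormal basis of the top-k PCA subspace."""
    centered = frames - frames.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def grassmann_distance(a, b):
    """Geodesic distance between the subspaces spanned by a and b.

    Both inputs must have orthonormal columns. The singular values of
    a.T @ b are the cosines of the principal angles between the subspaces.
    """
    s = np.linalg.svd(a.T @ b, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    return float(np.sqrt(np.sum(theta ** 2)))
```

A distance near zero means the two sets of spectral patterns span nearly the same subspace; larger values mean the subspaces diverge.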