Purpose:
Slow speech rate and abnormal temporal prosody are primary diagnostic criteria for differentiating between people with aphasia who do and do not have apraxia of speech. We sought to identify appropriate cutoff values for abnormal word syllable duration (WSD) in a word repetition task, interpret them relative to a data set of people with chronic aphasia, and evaluate the extent to which manually derived measures could be approximated through an automated process that relied on commercial speech recognition technology.
Method:
Fifty neurotypical participants produced 49 multisyllabic words during a repetition task. Audio recordings were submitted to an automated speech recognition (ASR) service (IBM Watson) to measure word duration and generate an orthographic transcription. The transcribed words were compared to a lexical database to identify the number of syllables in each word. Automated and manual measures were compared for 50% of the sample. Results were interpreted relative to WSD scores from an existing data set of 195 people with mostly chronic aphasia.
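The measurement step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that WSD is word duration divided by the number of syllables, that the ASR output has already been reduced to (word, start, end) tuples with times in seconds, and that syllable counts come from a lookup table. The names SYLLABLE_COUNTS, word_syllable_duration, and mean_wsd are hypothetical, as are the example words.

```python
from statistics import mean

# Hypothetical stand-in for the lexical database used to look up syllable counts.
SYLLABLE_COUNTS = {"catastrophe": 4, "gingerbread": 3, "volcano": 3}


def word_syllable_duration(word, start_s, end_s, lexicon=SYLLABLE_COUNTS):
    """Return duration per syllable in ms, or None if the word is not in the lexicon.

    Assumption: WSD = word duration / number of syllables.
    """
    n_syllables = lexicon.get(word.lower())
    if n_syllables is None:
        return None  # transcription did not match a known target word
    return (end_s - start_s) * 1000.0 / n_syllables


def mean_wsd(tokens, lexicon=SYLLABLE_COUNTS):
    """Average WSD over (word, start_s, end_s) tuples, skipping unmatched words."""
    values = [word_syllable_duration(w, s, e, lexicon) for w, s, e in tokens]
    values = [v for v in values if v is not None]
    return mean(values) if values else None
```

For example, a three-syllable word spanning 0.0 to 0.9 s yields a WSD of about 300 ms under these assumptions. Words the recognizer transcribes incorrectly are simply skipped here; how the study handled them is not stated in this abstract.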
Results:
ASR correctly identified 83% of target words and 98% of target syllable counts. Automated word duration calculations were longer than manual measures due to imprecise cursor placement. After applying regression coefficients to the automated measures and examining the frequency distributions for both manual and estimated measures, a WSD of 303–316 ms, corresponding to the 95th percentile, was found to indicate longer-than-normal performance. With this cutoff, 40%–45% of participants with aphasia in our comparison sample had an abnormally long WSD.
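The correction and cutoff steps can be illustrated with a short sketch. It assumes a simple linear regression of manual WSD on automated WSD and a linearly interpolated 95th percentile; the abstract does not specify the regression model or the percentile method, so both are assumptions. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_manual(automated_wsd, slope, intercept):
    """Map automated WSD values onto the manual scale using the fitted line."""
    return [slope * a + intercept for a in automated_wsd]


def percentile(values, p):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def exceeds_cutoff(wsd_values, cutoff_ms):
    """Flag each WSD value that is longer than the cutoff."""
    return [v > cutoff_ms for v in wsd_values]
```

In this sketch, the line would be fitted on the subset with both manual and automated measures, the estimates computed for the neurotypical sample, and the cutoff taken as percentile(estimates, 95). Values from a comparison sample could then be passed to exceeds_cutoff.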
Conclusions:
We recommend using a rounded WSD cutoff score between 303 and 316 ms for manual measures. Future research will focus on customizing automated WSD methods to speech samples from people with aphasia, identifying target words that maximize production and measurement reliability, and developing WSD standard scores based on a large participant sample with and without aphasia.