Introduction
We investigated the agreement between automated and gold‐standard manual transcriptions of telephone chatbot‐based semantic verbal fluency testing.

Methods
We examined 78 cases from the Screening over Speech in Unselected Populations for Clinical Trials in AD (PROSPECT‐AD) study, including cognitively normal individuals and individuals with subjective cognitive decline, mild cognitive impairment, and dementia. We used Bayesian Bland–Altman analysis of word count and the qualitative features of semantic cluster size, cluster switches, and word frequencies.

Results
We found high levels of agreement for word count, with a 93% probability of a newly observed difference being below the minimally important difference. The qualitative features had fair levels of agreement. Word count reached high levels of discrimination between cognitively impaired and unimpaired individuals, regardless of transcription mode.

Discussion
Our results support the use of automated speech recognition particularly for the assessment of quantitative speech features, even when using data from telephone calls with cognitively impaired individuals in their homes.

Highlights
High levels of agreement were found between automated and gold‐standard manual transcriptions of telephone chatbot‐based semantic verbal fluency testing, particularly for word count.
The qualitative features had fair levels of agreement.
Word count reached high levels of discrimination between cognitively impaired and unimpaired individuals, regardless of transcription mode.
Automated speech recognition seems feasible and reliable for the assessment of quantitative and qualitative speech features, even when using data from telephone calls with cognitively impaired individuals in their homes.
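The study applied a Bayesian Bland–Altman analysis to compare word counts from the two transcription modes. As a minimal illustration of the underlying idea, the sketch below computes the classical (frequentist) Bland–Altman bias and 95% limits of agreement; the word counts are hypothetical example values, not data from the PROSPECT‐AD study, and the Bayesian variant used in the study would instead yield posterior distributions over these quantities.

```python
import numpy as np

# Hypothetical word counts per participant from the two transcription modes
# (illustrative only, not study data).
manual = np.array([18, 22, 15, 25, 20, 17, 23, 19])
automated = np.array([17, 22, 14, 24, 21, 16, 23, 18])

# Per-participant differences between the two modes.
diff = automated - manual

# Mean difference (systematic bias) and SD of the differences.
bias = diff.mean()
sd = diff.std(ddof=1)

# Classical 95% limits of agreement: bias +/- 1.96 * SD.
loa_lower = bias - 1.96 * sd
loa_upper = bias + 1.96 * sd

print(f"bias = {bias:.2f}, 95% LoA = ({loa_lower:.2f}, {loa_upper:.2f})")
```

In the Bayesian formulation, one would place priors on the bias and SD and report, for example, the posterior probability that a new difference falls below a minimally important difference, as in the 93% figure reported in the Results.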