Background
Phonematic and semantic verbal fluency tasks (VFTs) are widely used to capture cognitive deficits in people with neurodegenerative diseases. The most common analysis of VFTs counts the total number of words produced within a given time frame. Analyzing semantic and phonematic word clusters can provide additional information about frontal and temporal cognitive functions. Traditionally, clusters in the semantic VFT are identified using fixed word lists, which must be created manually, lack standardization, and are language specific. Furthermore, this technique cannot identify semantic clusters in the phonematic VFT.
Objective
The objective of this study was to develop a method for the automated analysis of semantically related word clusters for semantic and phonematic VFTs. Furthermore, we aimed to explore the cognitive domains captured by this analysis for people with Parkinson disease (PD).
Methods
Of 85 people with PD, 51 (60%) performed a tablet-based semantic VFT and 69 (81%) performed a tablet-based phonematic VFT. For both tasks, semantic word clusters were determined using a semantic relatedness model based on a neural network trained on the Wikipedia (Wikimedia Foundation) text corpus. The cluster characteristics derived from this model were compared with those derived from traditional evaluation methods of VFTs and with a set of neuropsychological parameters.
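To illustrate how a semantic relatedness model can yield clusters, the following is a minimal sketch and not the study's actual implementation. It assumes that each word is represented by a vector, that relatedness is measured by cosine similarity, and that a word joins the current cluster when its similarity to the preceding word reaches a threshold. The toy vectors, the threshold of 0.8, the chaining rule, and the function names are all hypothetical; a real analysis would use vectors from a model trained on a large text corpus.

```python
import math

# Hypothetical 3-dimensional vectors standing in for learned word embeddings.
# Real embeddings would come from a trained model and have far more dimensions.
TOY_VECTORS = {
    "dog": (0.9, 0.1, 0.0),
    "cat": (0.8, 0.2, 0.1),
    "horse": (0.7, 0.3, 0.0),
    "shark": (0.1, 0.9, 0.1),
    "whale": (0.2, 0.8, 0.2),
    "eagle": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_clusters(words, vectors, threshold=0.8):
    """Group consecutive words whose similarity to the previous word
    is at least `threshold` (an assumed, untuned value)."""
    clusters = []
    for word in words:
        if clusters and cosine(vectors[clusters[-1][-1]], vectors[word]) >= threshold:
            clusters[-1].append(word)  # continue the current cluster
        else:
            clusters.append([word])  # start a new cluster (a "switch")
    return clusters


def cluster_characteristics(clusters):
    """Summarize clusters; single words are counted as clusters of size 1,
    which is an assumption of this sketch."""
    sizes = [len(c) for c in clusters]
    return {
        "n_clusters": len(clusters),
        "n_switches": max(len(clusters) - 1, 0),
        "mean_cluster_size": sum(sizes) / len(sizes) if sizes else 0.0,
    }


# Words in the order a participant might produce them (illustrative only).
response = ["dog", "cat", "horse", "shark", "whale", "eagle"]
clusters = find_clusters(response, TOY_VECTORS)
stats = cluster_characteristics(clusters)
print(clusters)
print(stats)
```

Because the same procedure only needs word vectors and not a category list, it could in principle be applied to the words produced in a phonematic VFT as well. The threshold and the choice to compare only adjacent words would need to be validated against manual scoring.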
Results
For the semantic VFT, the cluster characteristics obtained through automated analyses showed good correlations with the cluster characteristics obtained through the traditional method. Cluster characteristics from automated analyses of phonematic and semantic VFTs correlated with overall cognitive function as measured by the Montreal Cognitive Assessment, executive function as measured by the Frontal Assessment Battery and the Trail Making Test, and language function as measured by the Boston Naming Test.
Conclusions
Our study demonstrated the feasibility of standardized automated cluster analyses of VFTs using semantic relatedness models. These models do not require manually creating and updating categorized word lists and, therefore, can be easily and objectively implemented in different languages, potentially allowing comparison of results across different languages. Furthermore, this method provides information about semantic clusters in phonematic VFTs, which cannot be obtained from traditional methods. Hence, this method could provide easily accessible digital biomarkers for executive and language functions in people with PD.