Individuals routinely search through memory for concepts. This behavior is commonly studied via the verbal fluency task (VFT), in which participants are asked to generate as many exemplars as they can from a given category (e.g., animals) or letter (e.g., F) within a fixed amount of time. Responses in the VFT tend to be clustered in meaningful ways, but individuals differ widely in how they cluster items. Despite the development of several hand-coded and automated methods for defining clusters and switches in the VFT, there is currently no consensus on which scoring method provides the best mechanistic account of how individuals search through memory. In this work, we provide an empirical evaluation of several automated methods for defining clusters and switches in the VFT by comparing model-predicted clusters with participant-designated clusters. We find that a method that combines gradual rises and drops in a weighted composite of semantic and phonological similarity best predicts participant-designated cluster-switch events across three domains (animals, foods, and occupations). Furthermore, we propose a novel approach to understanding idiosyncratic search behavior: in a pre-registered experiment, we compute a measure of discordance for each pairwise transition from a large dataset of cluster-switch designations made by independent raters (N = 211) for the same transitions. We find that transitions with high idiosyncratic scores have low lexical content (i.e., low semantic and phonological similarity), and that an individual’s score in one domain is predictive of their score in another domain, suggesting that idiosyncratic scores may capture meaningful information about non-lexical sources and processes that contribute to memory search at the individual level.
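To make the idea of similarity-based switch designation concrete, the following is a minimal Python sketch, not the evaluated method itself. It assumes that each transition between consecutive responses already has a semantic and a phonological similarity score, combines them with a hypothetical weight `alpha`, and marks a switch wherever the composite falls and then rises again (a simple local-minimum rule used here as a stand-in for the gradual rise-and-drop method). The function names, the weight, and the example values are all illustrative assumptions.

```python
from typing import List, Sequence


def composite_similarity(
    semantic: Sequence[float],
    phonological: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight on semantic similarity
) -> List[float]:
    """Weighted composite of per-transition semantic and phonological similarity."""
    if len(semantic) != len(phonological):
        raise ValueError("similarity sequences must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * s + (1.0 - alpha) * p for s, p in zip(semantic, phonological)]


def designate_switches(similarity: Sequence[float]) -> List[int]:
    """Label each transition as a switch (1) or a cluster continuation (0).

    similarity[i] is the composite similarity between response i and
    response i + 1. A transition is labeled a switch when its similarity
    is lower than that of both neighboring transitions (a drop followed
    by a rise). The first and last transitions lack a neighbor on one
    side and are left as 0 in this simplified rule.
    """
    labels = [0] * len(similarity)
    for i in range(1, len(similarity) - 1):
        if similarity[i - 1] > similarity[i] < similarity[i + 1]:
            labels[i] = 1
    return labels


# Illustrative (made-up) similarities for four transitions in a fluency list.
semantic = [0.8, 0.2, 0.7, 0.6]
phonological = [0.4, 0.0, 0.5, 0.2]

composite = composite_similarity(semantic, phonological, alpha=0.5)
labels = designate_switches(composite)
print(labels)  # [0, 1, 0, 0]
```

Predicted labels of this kind can then be compared, transition by transition, against participant-designated cluster-switch events to assess how well a given rule and weighting recover them.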