Since its establishment, psychology has struggled to find valid methods for studying thoughts and subjective experiences. Thirty years ago, Ericsson and Simon (1980) proposed that participants can give concurrent verbal expression to their thoughts (think aloud) while completing tasks without changing objectively measurable performance (accuracy). In contrast, directed requests for concurrent verbal reports, such as explanations or instructions to describe particular kinds of information, were predicted to change thought processes as a consequence of the need to generate this information, thus altering performance. By comparing performance in concurrent verbal reporting conditions with matched silent control conditions, Ericsson and Simon found several studies demonstrating that directed verbalization was associated with changes in performance. By contrast, the lack of effects of thinking aloud was suggested by only a handful of experimental studies. In this article, Ericsson and Simon's model is tested by a meta-analysis of 94 studies comparing performance while giving concurrent verbalizations with performance in a matched condition without verbalization. Findings based on nearly 3,500 participants show that the "think-aloud" effect size is indistinguishable from zero (r = -.03) and that this procedure remains nonreactive even after statistically controlling for additional factors such as task type (primarily visual or nonvisual). In contrast, procedures that entail describing or explaining thoughts and actions are significantly reactive, leading to higher performance than silent control conditions. All verbal reporting procedures tend to increase the time needed to complete tasks. These results suggest that the think-aloud method should be distinguished from other verbal reporting methods in future studies. Theoretical and practical implications are discussed.
Self-report measurements are ubiquitous in psychology, but they carry the potential of altering the very processes they are meant to measure. We assessed whether a common metamemory measure, judgments of learning, can change the ongoing process of memorizing and subsequent memory performance. Judgments of learning are a form of metamemory monitoring described as conscious reflection on one's own memory performance or encoding activities for the purpose of exerting strategic control over one's study and retrieval activities (T. O. Nelson & Narens, 1990). Much of the work examining the conscious monitoring of encoding relies on a paradigm in which participants are asked to estimate, in a judgment of learning, the probability that they will recall a given item. In 5 experiments, we find effects of measuring judgments of learning on how people allocate their study time to difficult versus easy items and on what they later recall. These results suggest that judgments of learning are partially constructed in response to the measurement question. The tendency of judgments of learning to alter performance places them in the company of other reactive verbal reporting methods, counseling researchers to consider incorporating control groups, creating alternative scales, and exploring other verbal reporting methods. Less directive methods of accessing participants' metacognition and other judgments should be considered as an alternative to response scales.
Secular gains in intelligence test scores have perplexed researchers since they were documented by Flynn (1984, 1987). Gains are most pronounced on abstract, so-called culture-free tests, prompting Flynn (2007) to attribute them to problem-solving skills afforded by scientifically advanced cultures. We propose that recent-born individuals have adopted an approach to analogy that enables them to infer higher-level relations requiring roles that are not intrinsic to the objects that constitute initial representations of items. This proposal is translated into item-specific predictions about differences between cohorts in pass rates and item-response patterns on the Raven's Matrices (Flynn, 1987), a seemingly culture-free test that registers the largest Flynn effect. Consistent with predictions, archival data reveal that individuals born around 1940 are less able than individuals born around 1990 to map objects at higher levels of relational abstraction. Polytomous Rasch models verify predicted violations of measurement invariance: raw scores are found to underestimate the number of analogical rules inferred by members of the earlier cohort relative to members of the later cohort who achieve the same overall score. The work provides a plausible cognitive account of the Flynn effect, furthers understanding of the cognition of matrix reasoning, and underscores the need to consider how test-takers select item responses.
Few studies have examined the impact of age on reactivity to concurrent think-aloud (TA) verbal reports. An initial study with 30 younger and 31 older adults revealed that thinking aloud improved older adults' performance on a short form of the Raven's Matrices (Bors & Stokes, 1998, Educational and Psychological Measurement, 58, p. 382) but did not affect performance on other tasks. In a second experiment, 30 older adults (mean age = 73.0 years) completed the Raven's Matrices and three other tasks to replicate and extend the initial findings. Once again, older adults performed significantly better while thinking aloud only on the Raven's Matrices. Performance gains on this task were substantial (d = 0.73 and 0.92 in Experiments 1 and 2, respectively), corresponding to a fluid intelligence increase of nearly one standard deviation.
Schooler's (2011) commentary on our meta-analysis (Fox, Ericsson, & Best, 2011), although thoughtful and generally complimentary, misclassifies the think-aloud method as a form of introspection. Although he praised the scientific rigor of the think-aloud method, Schooler criticized its limitations as a mode of capturing the full range of conscious phenomena, especially nonverbal aspects of consciousness. He noted that reactive effects (changes in the accuracy of performance) are often observed when experimenters induce verbalization of "particularly ineffable experiences" (p. 347). In this reply, we show that thinking aloud is not introspective by establishing that it requires neither inner observation nor the generation of descriptions and explanations. In contrast to introspective methods, thinking aloud involves only focusing on a challenging task while concurrently giving verbal expression to thoughts entering attention. Given that our meta-analysis found no significant reactivity for this type of verbalization but showed significant reactivity for instructions requesting explanations and detailed descriptions, we conclude that researchers now have a choice between 2 qualitatively distinct methodologies for studying thinking: introspective methods, which change observed performance and, by inference, task-related processes; or the think-aloud method, which changes neither.