Qualitative approaches to cognitive rigor and to depth and complexity are broadly represented by Webb's Depth of Knowledge and Bloom's Taxonomy. Quantitative approaches have been relatively scant, and some have been based on ancillary measures such as the thinking time expended to answer test items. In competitive chess and other games amenable to incremental search and expert evaluation of options, we show how depth and complexity can be quantified naturally. We synthesize our depth and complexity metrics for chess into measures of difficulty and discrimination, and use these metrics to analyze thousands of games played by humans and computers. We show the extent to which human players of various skill levels evince shallow versus deep thinking, and how they cope with 'difficult' versus 'easy' move decisions. The goal is to transfer these measures and results to application areas such as multiple-choice testing, which corresponds closely in form and item values to the problem of finding good moves in chess positions.
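As a concrete illustration of what such a complexity metric might look like, the following sketch counts the near-optimal candidate moves in a position; the function name, data format, and 25-centipawn threshold are our own illustrative assumptions, not the paper's exact formulation.

```python
def complexity(move_evals, epsilon_cp=25):
    """Illustrative complexity metric: the number of moves whose engine
    evaluation lies within epsilon_cp centipawns of the best move.
    The definition and threshold are assumptions, not the paper's."""
    best = max(move_evals.values())
    return sum(1 for v in move_evals.values() if best - v <= epsilon_cp)

# Example: two near-equivalent candidates make a harder, more "complex" choice.
print(complexity({"e4": 30, "d4": 28, "Nf3": -5, "g4": -80}))  # -> 2
```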
Abstract: Inferences about structured patterns in human decision making have been drawn from medium-scale simulated competitions with human subjects. The concepts analyzed in these studies include level-k thinking, satisficing, and other human error tendencies. These concepts can be mapped, via a natural depth-of-search metric, into the domain of chess, where copious data are available from hundreds of thousands of games played in real competitions by players across a wide range of precisely known skill levels. The games are analyzed by strong chess programs, which produce authoritative utility values for the move options at each decision via progressive deepening of search. Our experiments show a significant relationship between formulations of level-k thinking and the skill level of players. Notably, the players are distinguished solely on moves where they erred, according to the average depth at which their errors are exposed by the authoritative analysis. Our results also indicate that the decisions are often independent of tail assumptions on higher-order beliefs. Further, we observe changes in this relationship in different contexts, such as minimal versus acute time pressure. Finally, we relate satisficing to an insufficient level of reasoning and give a numerical answer to the question: why do humans blunder?
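To make the depth metric concrete, here is a minimal sketch of how the depth at which an error is "exposed" could be read off from progressively deeper engine evaluations; the data format, function name, and 50-centipawn threshold are illustrative assumptions rather than the study's exact procedure.

```python
def exposure_depth(evals_best, evals_played, threshold_cp=50):
    """Return the first search depth at which the played move falls at
    least threshold_cp centipawns short of the engine's best move, or
    None if the gap never reaches the threshold. Evaluations are in
    centipawns from the mover's point of view, indexed by depth 1..d."""
    for depth, (best, played) in enumerate(zip(evals_best, evals_played), start=1):
        if best - played >= threshold_cp:
            return depth
    return None

# Example: the error only becomes visible from depth 4 onward.
best   = [20, 25, 30, 90, 110]   # eval of engine's best move at depths 1..5
played = [20, 30, 25, 10, -15]   # eval of the move actually played
print(exposure_depth(best, played))  # -> 4
```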
Abstract: We build a model for the kind of decision making involved in games of strategy such as chess, making it abstract enough to remove essentially all game-specific contingency, and compare it to known psychometric models of test taking, item response, and performance assessment. Decisions are modeled in terms of fallible agents $Z$ faced with possible actions $a_i$ whose utilities $u_i = u(a_i)$ are not fully apparent. The three main goals of the model are prediction, meaning to infer probabilities $p_i$ for $Z$ to choose $a_i$; intrinsic rating, meaning to assess the skill of a person's actual choices $a_{i_t}$ over various test items $t$; and simulation of the distribution of choices by an agent with a specified skill set. We describe and train the model on large data from chess tournament games by players of different ranks, and exemplify its accuracy by applying it to give intrinsic ratings for world championship matches.
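As a toy instance of the prediction goal, the sketch below maps utilities $u_i$ to choice probabilities $p_i$ with a single skill-like parameter; the softmax form and the parameter s are stand-in assumptions for illustration, not the fitted model described in the abstract.

```python
import math

def choice_probabilities(utilities, s=0.1):
    """Map utilities u_i (in pawns) to choice probabilities p_i for a
    fallible agent. A softmax with sensitivity s stands in for the
    paper's fitted model (an illustrative assumption); smaller s makes
    the agent sharper, concentrating probability on the best action."""
    weights = [math.exp(u / s) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Example: three candidate actions with utilities in pawns.
print(choice_probabilities([0.5, 0.3, -0.2], s=0.2))   # spread-out choice
print(choice_probabilities([0.5, 0.3, -0.2], s=0.05))  # near-deterministic
```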
Research on judging decisions made by fallible (human) agents is not as advanced as research on finding optimal decisions. Human decisions are often influenced by factors such as risk, uncertainty, time pressure, and depth of cognitive capability, whereas decisions by an intelligent agent (IA) can be effectively optimal, free of these limitations. The concept of 'depth', a well-defined term in game theory (including chess), does not have a clear formulation in decision theory. To quantify 'depth' in decision theory, we can configure an IA of supreme competence to 'think' at depths beyond the capability of any human, and in the process collect evaluations of decisions at various depths. One research goal is to create an intrinsic measure of the depth of thinking required to answer certain test questions, toward a reliable means of assessing their difficulty apart from item-response statistics. We relate the depth of cognition by humans to depths of search, and use this information to infer the quality of the decisions made, so as to judge the decision-maker from his or her decisions. We use large data from real chess tournaments together with evaluations by chess programs (AI agents) whose strength exceeds that of all human players. We then seek to transfer the results to other decision-making fields in which effectively optimal judgments can be obtained from hindsight, answer banks, powerful AI agents, or answers provided by judges of varying competency.
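A minimal sketch of collecting such evaluations at successive depths, assuming the python-chess library and a locally installed UCI engine such as Stockfish (the engine path and depth cap are illustrative):

```python
import chess
import chess.engine

ENGINE_PATH = "stockfish"  # illustrative path; point this at your UCI engine

def evaluations_by_depth(fen, max_depth=16):
    """Evaluate one position at search depths 1..max_depth, returning
    centipawn scores from the side to move's point of view."""
    board = chess.Board(fen)
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    evals = {}
    try:
        for depth in range(1, max_depth + 1):
            info = engine.analyse(board, chess.engine.Limit(depth=depth))
            evals[depth] = info["score"].relative.score(mate_score=100000)
    finally:
        engine.quit()
    return evals

if __name__ == "__main__":
    print(evaluations_by_depth(chess.Board().fen(), max_depth=8))
```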