Artificial intelligence (AI) promises to take the flawed intelligence of humans out of machines. Why, then, might we want to put the inchoate intelligence of human infants into machines? Infants seem to intuit others’ underlying intentions merely by observing their actions; AI systems, in contrast, fall short in such commonsense psychology. Here we put infant and machine intelligence into direct dialogue through their performance on the Baby Intuitions Benchmark (BIB), a comprehensive suite of tasks probing commonsense psychology. Following a preregistered design and analysis plan, we collected 288 individual responses from 11-month-old infants to BIB’s six tasks and tested three state-of-the-art learning-driven neural-network models from two different model classes. Infants’ performance revealed their comprehensive understanding of agents as rational and goal-directed, but the models failed to capture infants’ knowledge. Addressing these striking differences between human and artificial intelligence is critical to building machine common sense.
Strong inductive biases allow children to learn in fast and adaptable ways. Children use the mutual exclusivity (ME) bias to help disambiguate how words map to referents, assuming that if an object already has one label, it does not need another. In this paper, we investigate whether standard neural architectures have an ME bias and demonstrate that they lack this learning assumption. Moreover, we show that their inductive biases are poorly matched to early-phase learning in two standard tasks: machine translation and object recognition. There is a compelling case for designing neural networks that reason by mutual exclusivity, which remains an open challenge.
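To make the ME probe concrete, here is a minimal, self-contained sketch in PyTorch; the toy one-hot world and the linear classifier are illustrative assumptions, not the paper’s architectures or data. It trains on familiar object–label pairs and then asks which object the model maps a never-seen label onto; an ME-biased learner would pick the novel object.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy world: 4 familiar objects plus 1 novel object, as one-hot features.
n_familiar = 4
n_objects = n_familiar + 1
objects = torch.eye(n_objects)            # row i = features of object i

# One label per object; label 4 (the novel label) never appears in training.
model = nn.Linear(n_objects, n_objects)   # object features -> label logits
opt = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.CrossEntropyLoss()

# Train only on the familiar pairings: object i <-> label i, for i < 4.
x_train, y_train = objects[:n_familiar], torch.arange(n_familiar)
for _ in range(200):
    opt.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt.step()

# ME probe: given the novel label, which object does the model prefer?
with torch.no_grad():
    probs = model(objects).softmax(dim=-1)  # row i: P(label | object i)
    novel = n_objects - 1
    print(f"P(novel label | familiar object) = {probs[0, novel].item():.3f}")
    print(f"P(novel label | novel object)    = {probs[novel, novel].item():.3f}")
# An ME-biased learner would put far more novel-label probability on the
# novel object; the paper reports that standard networks show no such
# systematic preference.
```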
To achieve human-like common sense about everyday life, machine learning systems must understand and reason about the goals, preferences, and actions of others. Human infants intuitively achieve such common sense by making inferences about the underlying causes of other agents’ actions. Directly informed by research on infant cognition, our benchmark, BIB, challenges machines to achieve generalizable, common-sense reasoning about other agents as human infants do. Moreover, as in studies of infant cognition, we use a violation-of-expectation paradigm in which machines must predict the plausibility of an agent’s behavior given a video sequence, making this benchmark appropriate for direct validation with human infants in future studies. We show that recently proposed, deep-learning-based agency-reasoning models fail to show infant-like reasoning, leaving BIB an open challenge.
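To illustrate the evaluation, here is a minimal sketch of a violation-of-expectation (VOE) score; the `VOETrial` container and `plausibility` callable are hypothetical stand-ins for a model interface, not BIB’s released API. A model passes a trial when it rates the expected test video as more plausible than the unexpected one, so chance performance is 0.5.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Video = Sequence  # stand-in: a video is some sequence of frames


@dataclass
class VOETrial:
    familiarization: list   # videos establishing the agent's goal/preference
    expected: Video         # test video consistent with that goal
    unexpected: Video       # test video violating it


def voe_accuracy(
    plausibility: Callable[[list, Video], float],
    trials: List[VOETrial],
) -> float:
    """Fraction of trials where the expected video is rated more plausible.

    `plausibility(familiarization, test_video)` is any model-provided score;
    a scorer that ignores the familiarization performs at chance (0.5).
    """
    correct = sum(
        plausibility(t.familiarization, t.expected)
        > plausibility(t.familiarization, t.unexpected)
        for t in trials
    )
    return correct / len(trials)
```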