2021
DOI: 10.48550/arxiv.2111.08826
Preprint

A Benchmark for Modeling Violation-of-Expectation in Physical Reasoning Across Event Categories

Abstract: Recent work in computer vision and cognitive reasoning has given rise to an increasing adoption of the Violation-of-Expectation (VoE) paradigm in synthetic datasets. Inspired by infant psychology, researchers are now evaluating a model's ability to label scenes as either expected or surprising with knowledge of only expected scenes. However, existing VoE-based 3D datasets in physical reasoning provide mainly vision data with little to no heuristics or inductive biases. Cognitive models of physical reasoning rev…

Cited by 1 publication (3 citation statements)
References 41 publications
“…However, a number of benchmarks that aim to evaluate the general ability of AI systems to reason about unexpected events or situations have recently emerged (Riochet et al 2020;Shu et al 2021;Gandhi et al 2021;Dasgupta et al 2021;Piloto et al 2022;Weihs et al 2022). Most of these benchmarks focus on intuitive physics, targeting physical concepts such as continuity, solidity, object persistence and gravity (Smith et al 2019;Riochet et al 2020;Dasgupta et al 2021;Piloto et al 2022;Weihs et al 2022). Conversely, benchmarks that test models' ability to reason about other agents have received less attention.…”
Section: Related Work (mentioning)
confidence: 99%
“…Previous research in machine common-sense reasoning has mainly focused on evaluating language processing (Bhagavatula et al 2020;Huang et al 2019;Zellers et al 2019;Bisk et al 2020;Sap et al 2019;Sakaguchi et al 2021) or visual scene understanding (Yi et al 2020;Smith et al 2019;Ates et al 2022) in a task-specific manner. More recently, several new benchmarks have been introduced that allow to assess the more general ability of AI systems to reason about unexpected events or situations (Riochet et al 2020;Gandhi et al 2021;Dasgupta et al 2021;Shu et al 2021;Piloto et al 2022;Weihs et al 2022). Most of them focus on intuitive physics (Yi et al 2020;Smith et al 2019;Ates et al 2022;Riochet et al 2020;Dasgupta et al 2021;Piloto et al 2022;Weihs et al 2022;Piloto et al 2018) in which computational models have to reason about properties and interactions of physical macroscopic objects.…”
Section: Introduction (mentioning)
confidence: 99%