We present a novel augmented reality (AR) interface that provides an effective means to diagnose a robot's erroneous behaviors, endow it with new skills, and patch its knowledge structure represented by an And-Or-Graph (AOG). Specifically, an AOG representation of opening medicine bottles is learned from human demonstration and yields a hierarchical structure that captures the spatiotemporal compositional nature of the given task, which is highly interpretable to users. Through a series of psychological experiments, we demonstrate that the explanations of a robotic system, inherited from and produced by the AOG, foster human trust better than other forms of explanations. Moreover, by visualizing the knowledge structure and robot states, the AR interface allows human users to intuitively understand what the robot knows, supervise the robot's task planner, and interactively teach the robot new actions. Together, these capabilities let users quickly identify the reasons for failures and conveniently patch the current knowledge structure to prevent future errors, demonstrating the interpretability of our knowledge representation and the new forms of interaction afforded by the proposed AR interface.

Keywords: augmented reality (AR), explainable artificial intelligence (XAI), robot learning
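To give a concrete sense of the kind of hierarchical, compositional structure an And-Or-Graph encodes, the following is a minimal sketch in Python. The node class, the `parse_graph` helper, and the particular decomposition of "open bottle" into sub-tasks are illustrative assumptions, not the learned model or implementation described in this paper.

```python
# Minimal sketch of an And-Or-Graph (AOG) node structure. An AND node
# decomposes a task into an ordered sequence of sub-tasks (temporal
# composition); an OR node selects one of several alternative decompositions;
# terminal nodes are primitive actions. The decomposition below is hypothetical.
from dataclasses import dataclass, field
from typing import List


@dataclass
class AOGNode:
    name: str
    node_type: str               # "and", "or", or "terminal" (primitive action)
    children: List["AOGNode"] = field(default_factory=list)


def parse_graph(node: AOGNode, depth: int = 0) -> None:
    """Print one parse of the graph: expand AND children in order and
    pick the first alternative at each OR node."""
    indent = "  " * depth
    if node.node_type == "terminal":
        print(f"{indent}action: {node.name}")
    elif node.node_type == "and":
        print(f"{indent}{node.name} (AND)")
        for child in node.children:
            parse_graph(child, depth + 1)
    else:  # "or": choose one alternative (here, simply the first)
        print(f"{indent}{node.name} (OR)")
        parse_graph(node.children[0], depth + 1)


# Hypothetical decomposition of opening a child-safe medicine bottle.
open_bottle = AOGNode("open_bottle", "and", [
    AOGNode("grasp_lid", "terminal"),
    AOGNode("unlock_lid", "or", [
        AOGNode("push_and_twist", "and", [
            AOGNode("push_down", "terminal"),
            AOGNode("twist", "terminal"),
        ]),
        AOGNode("pinch_and_twist", "and", [
            AOGNode("pinch", "terminal"),
            AOGNode("twist", "terminal"),
        ]),
    ]),
    AOGNode("pull_off_lid", "terminal"),
])

parse_graph(open_bottle)
```

Because each OR node exposes its alternatives explicitly, such a structure lends itself to visualization and to pinpointing which branch of the task decomposition failed.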
| INTRODUCTION

The ever-growing amount of data and rapidly increasing computing power have recently boosted the data-driven machine learning paradigm. Despite promising progress in model performance, such purely data-driven approaches, especially methods using deep neural networks, have one well-known limitation: the lack of interpretability. Various attempts have therefore been made to alleviate this shortcoming, such as visualizing filter responses,1-3 developing communication protocols,4-7 and generating text descriptions for images8,9 or robot behaviors.10,11 However, these explanation mechanisms only operate through two-dimensional (2D) interactions (ie, computer screens) and fall short when interacting with physical robots, especially in mission-critical settings that require supervising multiple robots. Therefore, an interpretable knowledge representation for robots and an effective explanation interface beyond 2D are needed to provide better situational awareness, richer spatial information, and in situ explanations during human-robot interaction. Of note, for physical robots, less attention has been paid to introducing