Mental Imagery for a Conversational Robot

Roy, Deb; Hsiao, Kaijen; Mavridis, Nikolaos

doi:10.1109/tsmcb.2004.823327

Cited by 92 publications

(61 citation statements)

References 28 publications


“…There is a separate body of work on qualitative models of action effects on objects, rooted in naive physics (Hayes 1995) and qualitative physics (Kuipers 1986). In a similar spirit, there is work on using physics engines to learn qualitative action effects (Mugan and Kuipers 2012), and on high level planning of manipulation (Stilman and Kuffner 2008; Roy et al 2004) using qualitative action models. Some early ideas on push planning have reappeared in recent robots, which plan pushes to enable grasps in clutter (Dogar and Srinivasa 2010).…”

Section: Related Work (mentioning)

confidence: 99%

Learning modular and transferable forward models of the motions of push manipulated objects

et al. 2016


The ability to predict how objects behave during manipulation is an important problem. Models informed by mechanics are powerful, but are hard to tune. An alternative is to learn a model of the object's motion from data, to learn to predict. We study this for push manipulation. The paper starts by formulating a quasi-static prediction problem. We then pose the problem of learning to predict in two different frameworks: (i) regression and (ii) density estimation. Our architecture is modular: many simple, object specific, and context specific predictors are learned. We show empirically that such predictors outperform a rigid body dynamics engine tuned on the same data. We then extend the density estimation approach using a product of experts. This allows transfer of learned motion models to objects of novel shape, and to novel actions. With the right representation and learning method, these transferred models can match the prediction performance of a rigid body dynamics engine for novel objects or actions.


“…It is worth noting that proximity problems also arise in cognitive robotics, where a robot may need to anticipate whether it will be next to an object (e.g., a person, food tray or book) it plans to interact with, after one or more moves or other actions. The "next-to" problem was analyzed in (Schubert 1990, 1994), and recent robotics research recognizes that cognitive robots need to construct three-dimensional spatial models of their environment (e.g., Roy et al 2004).…”

Section: The Need For Imagistic Modeling (mentioning)

confidence: 99%

On the need for imagistic modeling in story understanding

Bigelow

Scarafoni

Schubert

et al. 2015

Biologically Inspired Cognitive Architectures


There is ample evidence that human understanding of ordinary language relies in part on a rich capacity for imagistic mental modeling. We argue that genuine language understanding in machines will similarly require an imagistic modeling capacity enabling fast construction of instances of prototypical physical situations and events, whose participants are drawn from a wide variety of entity types, including animate agents. By allowing fast evaluation of predicates such as 'can-see', 'under', and 'inside', these model instances support coherent text interpretation. Imagistic modeling is thus a crucial, and not very broadly appreciated, aspect of the long-standing knowledge acquisition bottleneck in AI. We will illustrate how the need for imagistic modeling arises even in the simplest first-reader stories for children, and provide an initial feasibility study to indicate what the architecture of a system combining symbolic with imagistic understanding might look like.


“…Joly et al applied this concept to logo retrieval in large image collection [13] and Jiang et al did this to bag-of-visual-words [14]. As visual representational aspects, a visual mental imagery is used as inner representation of cognitive processes of humans [16], AIs [17] and even robots [18].…”

Section: Related Work (mentioning)

confidence: 99%

Visual Query Expansion via Incremental Hypernetwork Models of Image and Text

Heo

Kang

Zhang

2010

PRICAI 2010: Trends in Artificial Intelligence


Abstract. Humans can associate vision and language modalities and thus generate mental imagery, i.e. visual images, from linguistic input in an environment of unlimited inflowing information. Inspired by human memory, we separate a text-to-image retrieval task into two steps: 1) text-to-image conversion (generating visual queries for the second step) and 2) image-to-image retrieval task. This separation is advantageous for inner representation visualization, learning incremental dataset, using the results of content-based image retrieval. Here, we propose a visual query expansion method that simulates the capability of human associative memory. We use a hypernetwork model (HN) that combines visual words and linguistic words. HNs learn the higher-order cross-modal associative relationships incrementally on a set of image-text pairs in sequence. An incremental HN generates images by assembling visual words based on linguistic cues. And we retrieve similar images with the generated visual query. The method is evaluated on 26 video clips of 'Thomas and Friends'. Experiments show the performance of successive image retrieval rate up to 98.1% with a single text cue. It shows the additional potential to generate the visual query with several text cues simultaneously.

