Multimodal categorization by hierarchical dirichlet process

Nakamura,; Nagai,; Iwahashi,

doi:10.1109/iros.2011.6048371

Cited by 13 publications

(37 citation statements)

References 13 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Nakamura et al proposed a method to learn object concepts and word meanings from multimodal information and verbal information [10]. The method proposed in [10] is a categorization method based on multimodal latent Dirichlet allocation (MLDA) that enables the acquisition of object concepts from multimodal information, such as visual, auditory, and haptic information [13]. Araki et al addressed the development of a method combining unsupervised word segmentation from uttered sentences by a nested Pitman-Yor language model (NPYLM) [14] and the learning of object concepts by MLDA [11].…”

Section: A Lexical Acquisitionmentioning

confidence: 99%

Spatial Concept Acquisition for a Mobile Robot that Integrates Self-Localization and Unsupervised Word Discovery from Spoken Sentences

Taniguchi

Inamura

2016

IEEE Trans. Cogn. Dev. Syst.

View full text Add to dashboard Cite

Abstract-In this paper, we propose a novel unsupervised learning method for the lexical acquisition of words related to places visited by robots, from human continuous speech signals. We address the problem of learning novel words by a robot that has no prior knowledge of these words except for a primitive acoustic model. Further, we propose a method that allows a robot to effectively use the learned words and their meanings for self-localization tasks. The proposed method is nonparametric Bayesian spatial concept acquisition method (SpCoA) that integrates the generative model for self-localization and the unsupervised word segmentation in uttered sentences via latent variables related to the spatial concept. We implemented the proposed method SpCoA on SIGVerse, which is a simulation environment, and TurtleBot 2, which is a mobile robot in a real environment. Further, we conducted experiments for evaluating the performance of SpCoA. The experimental results showed that SpCoA enabled the robot to acquire the names of places from speech sentences. They also revealed that the robot could effectively utilize the acquired spatial concepts and reduce the uncertainty in self-localization.

show abstract

Section: A Lexical Acquisitionmentioning

confidence: 99%

Spatial Concept Acquisition for a Mobile Robot that Integrates Self-Localization and Unsupervised Word Discovery from Spoken Sentences

Taniguchi

Inamura

2016

IEEE Trans. Cogn. Dev. Syst.

View full text Add to dashboard Cite

show abstract

“…4, the BoMHDP model is an extension of the MHDP model [5]. In this figure, x s jn , x c jn , x a jn , and x h jn denote the n-th feature of the j-th object's SIFT, color, audio, and haptic information, respectively.…”

Section: Multimodal Informationmentioning

confidence: 99%

“…We also proposed a multimodal categorization method [5] based on the hierarchical Dirichlet process (HDP) [6] to solve the problem of pLSA and LDA requiring the number of categories in advance. In these reports, we showed that multimodal information can be used to form categories that seem natural to humans.…”

Section: Introductionmentioning

confidence: 99%

Bag of multimodal hierarchical dirichlet processes: Model of complex conceptual structure for intelligent robots

Nagaoka

Nagai

Iwahashi³

2012

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems

View full text Add to dashboard Cite

The formation of categories, which constitutes the basis of developing concepts, requires multimodal information with a complex structure. We propose a model called the bag of multimodal hierarchical Dirichlet processes (BoMHDP), which enables robots to form a variety of multimodal categories. The BoMHDP model is a collection of a large number of MHDP models, each of which has a different set of weights for sensory information. The weights work to realize selective attention and enable the formation of various types of categories (e.g., object, haptic, and color). The BoMHDP model is an extension of the HDP, and categorization is unsupervised. However, categories that are not natural for humans are also formed. Therefore, only the significant categories are selected through interaction between the user and the robot. At the same time, words obtained during the interaction are connected to the categories. Finally, categories, which are represented by words, are selected. The BoMHDP model was implemented on a robot platform and a preliminary experiment was conducted to validate it. The results revealed that various categories can be formed with the BoMHDP model. We also analyzed the formed conceptual structure by using multidimensional scaling.The results indicate that the complex conceptual structure was represented reasonably well with the BoMHDP model.

show abstract

“…Comprehending the surrounding environment through processing and analyzing of sensory input is one of the most essential functions for robots and intelligent systems [1], [2]. For example, automatic cleaning robots are equipped with haptic sensors or infrared sensors to avoid obstacles in a domestic room [3], and telepresence robots have cameras and microphones to deliver environmental awareness to their operators as remote sensing devices [4].…”

Section: Introductionmentioning

confidence: 99%

“…These two frameworks are different in their primal objectives: (1) a computational efficiency or tractability, and (2) an overall optimization of the system. Figure 1 illustrates two kinds of frameworks: A cascaded framework prioritizes the computational efficiency by sequentially extracting multiple pieces of information with different subsystems.…”

Section: Introductionmentioning

confidence: 99%

Unified auditory functions based on Bayesian topic model

Otsuka

Ishiguro

Sawada

et al. 2012

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems

View full text Add to dashboard Cite

Abstract-Existing auditory functions for robots such as sound source localization and separation have been implemented in a cascaded framework whose overall performance may be degraded by any failure in its subsystems. These approaches often require a careful and environment-dependent tuning for each subsystems to achieve better performance. This paper presents a unified framework for sound source localization and separation where the whole system is integrated as a Bayesian topic model. This method improves both localization and separation with a common configuration under various environments by iterative inference using Gibbs sampling. Experimental results from three environments of different reverberation times confirm that our method outperforms stateof-the-art sound source separation methods, especially in the reverberant environments, and shows localization performance comparable to that of the existing robot audition system.

show abstract

Multimodal categorization by hierarchical dirichlet process

Cited by 13 publications

References 13 publications

Spatial Concept Acquisition for a Mobile Robot that Integrates Self-Localization and Unsupervised Word Discovery from Spoken Sentences

Spatial Concept Acquisition for a Mobile Robot that Integrates Self-Localization and Unsupervised Word Discovery from Spoken Sentences

Bag of multimodal hierarchical dirichlet processes: Model of complex conceptual structure for intelligent robots

Unified auditory functions based on Bayesian topic model

Contact Info

Product

Resources

About