2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros.2016.7759410

Online joint learning of object concepts and language model using multimodal hierarchical Dirichlet process

Cited by 9 publications (11 citation statements) · References 20 publications

“…The point of this model is that multimodal categories are used for learning lexical information and vice versa. This way, the robot could acquire approximately 70 appropriate words through interaction with a user over 1 month [86]. This suggests that unsupervised learning, e.g., multimodal object categorization, can provide a basic internal representation system for primal language learning.…”
Section: Unsupervised Learning Viewpoint: From Multimodal Categorization
Citation type: mentioning
confidence: 99%
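The mutual-learning idea in this statement, with categories sharpening word learning and words sharpening categories, can be illustrated with a far simpler model than the paper's multimodal hierarchical Dirichlet process. The sketch below is a toy multimodal mixture fit with EM, in which word counts and visual features share one latent category so that each modality's statistics influence the other's estimates; all dimensions, data, and variable names are illustrative assumptions, not the authors' implementation.

```python
# Toy multimodal mixture with EM: words and visual features are observations
# of a shared latent category, so category estimates couple the two modalities.
# This is a simplified stand-in for the paper's multimodal HDP, not its method.
import numpy as np

rng = np.random.default_rng(0)

K, V, D = 3, 20, 8        # categories, vocabulary size, visual-feature dim (assumed)
N = 30                    # number of observed objects (toy data)
word_counts = rng.integers(0, 4, size=(N, V)).astype(float)  # bag of words per object
vis_feats = rng.normal(size=(N, D))                          # visual feature per object

pi = np.full(K, 1.0 / K)                 # category prior
phi = rng.dirichlet(np.ones(V), size=K)  # per-category word distributions
mu = rng.normal(size=(K, D))             # per-category visual prototypes

for _ in range(50):
    # E-step: responsibilities combine BOTH modalities, which is what
    # lets lexical evidence reshape categories and vice versa.
    log_r = np.log(pi) + word_counts @ np.log(phi).T
    log_r -= 0.5 * ((vis_feats[:, None, :] - mu[None]) ** 2).sum(-1)
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: categories re-estimated from words AND vision jointly.
    pi = r.mean(axis=0)
    phi = (r.T @ word_counts) + 1e-2
    phi /= phi.sum(axis=1, keepdims=True)
    mu = (r.T @ vis_feats) / (r.sum(axis=0)[:, None] + 1e-9)

print("category assignments:", r.argmax(axis=1))
```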
“…In the experiments, we used the dataset that was used in (Aoki et al., 2016). This dataset was composed of images, tactile sensor values, and sound data that the robot obtained from the objects, together with utterances given by a human teaching the features of the objects.…”
Section: Dataset
Citation type: mentioning
confidence: 99%
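For concreteness, one record of such a multimodal dataset might be laid out as below. This is only an assumed structure; the field names, feature dimensions, and phoneme placeholder are illustrative and are not taken from Aoki et al. (2016).

```python
# Assumed layout for one multimodal observation; all fields are illustrative.
from dataclasses import dataclass
import numpy as np

@dataclass
class ObjectObservation:
    image_features: np.ndarray     # visual descriptor extracted from the camera image
    tactile_values: np.ndarray     # tactile sensor readings while grasping the object
    sound_features: np.ndarray     # features of the sound the object makes when shaken
    utterance_phonemes: list[str]  # recognized phoneme sequence of the teaching utterance

example = ObjectObservation(
    image_features=np.zeros(128),
    tactile_values=np.zeros(16),
    sound_features=np.zeros(13),
    utterance_phonemes=["k", "o", "r", "e"],  # placeholder phonemes
)
```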
“…Furthermore, the integrated model can generate images from the features that are estimated from the words using the MLDA. We used a multimodal dataset composed of images and teaching utterances, which were obtained by the robot observing the objects and the human teacher teaching the object features by speech (Aoki et al., 2016). Because the human teacher did not necessarily utter the words corresponding to the object labels, the teaching utterances included words that were not related to the labels, and speech recognition errors also occurred.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
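The words-to-images direction described here can be sketched with a plain topic model: infer a topic mixture from the observed words by folding-in, then read out the expected image-feature distribution from the same topics. The snippet below is that simplified approximation, not the paper's actual MLDA inference; all parameters and data are randomly generated stand-ins.

```python
# Cross-modal readout sketch: words -> topic mixture -> expected image features.
# Simplified fold-in approximation, not the actual MLDA inference procedure.
import numpy as np

rng = np.random.default_rng(1)
K, V, F = 4, 50, 64                      # topics, vocabulary, image-feature bins (assumed)

phi_word = rng.dirichlet(np.ones(V), K)  # stand-in for learned p(word | topic)
phi_img = rng.dirichlet(np.ones(F), K)   # stand-in for learned p(image feature | topic)

words = rng.integers(0, V, size=12)      # observed word tokens of a new utterance

# Fold-in: EM on the topic mixture theta for this one document only.
theta = np.full(K, 1.0 / K)
for _ in range(100):
    resp = theta[:, None] * phi_word[:, words]   # (K, n_words) responsibilities
    resp /= resp.sum(axis=0, keepdims=True)
    theta = resp.sum(axis=1)
    theta /= theta.sum()

# Cross-modal generation: expected image-feature distribution given the words.
expected_img = theta @ phi_img
print("inferred topic mixture:", np.round(theta, 3))
print("expected image-feature distribution sums to", expected_img.sum())
```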
“…In particular, Araki et al. (2012b) proposed online Multimodal Latent Dirichlet Allocation (oMLDA) to acquire object concepts in an online manner, and combined this with the Nested Pitman-Yor Language Model (NPYLM), making it possible to perform lexical acquisition of unknown words sequentially. Aoki et al. (2016) constructed an algorithm that can infer an approximately globally optimal solution by representing it as a single integrated model. The NPYLM is an unsupervised morphological analysis method based on a statistical model that enables word segmentation exclusively from phoneme sequences (Mochihashi et al., 2009).…”
Section: Improvement Of Online Learning Based On Particle Filters In …
Citation type: mentioning
confidence: 99%
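As a rough illustration of the segmentation step NPYLM performs, the toy sketch below runs a Viterbi search over all segmentations of an unsegmented phoneme string under a fixed unigram word model. The real NPYLM learns the lexicon and a hierarchical Pitman-Yor language model jointly and without supervision (Mochihashi et al., 2009); the word probabilities here are invented toy values that demonstrate only the decoding step.

```python
# Toy stand-in for NPYLM's word segmentation: Viterbi over all segmentations
# of a phoneme string under a fixed unigram word model (probabilities assumed).
import math

word_logp = {
    "kore": math.log(0.3), "wa": math.log(0.3),
    "ringo": math.log(0.2), "desu": math.log(0.2),
}
UNK_LOGP = math.log(1e-6)  # penalty for an unknown single phoneme

def segment(s: str, max_len: int = 6) -> list[str]:
    """Viterbi search: best[i] holds (log-prob, backpointer) for prefix s[:i]."""
    n = len(s)
    best = [(-math.inf, -1)] * (n + 1)
    best[0] = (0.0, -1)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            w = s[j:i]
            lp = word_logp.get(w, UNK_LOGP if len(w) == 1 else -math.inf)
            cand = best[j][0] + lp
            if cand > best[i][0]:
                best[i] = (cand, j)
    out, i = [], n
    while i > 0:             # backtrack from the end of the string
        j = best[i][1]
        out.append(s[j:i])
        i = j
    return out[::-1]

print(segment("korewaringodesu"))  # -> ['kore', 'wa', 'ringo', 'desu']
```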