2011 IEEE/RSJ International Conference on Intelligent Robots and Systems
DOI: 10.1109/iros.2011.6048371
Multimodal categorization by hierarchical Dirichlet process

Abstract: Small informal meetings of two to four participants are very common in work environments. For this reason, a convenient way of recording and archiving these meetings is of great interest. In order to efficiently archive such meetings, an important task to address is to keep track of "who talked when" during a meeting. This paper proposes a new multi-modal approach to tackle this speaker activity detection problem. One novelty of the proposed approach is that it uses a human tracker that relies on scann…

Cited by 13 publications (37 citation statements)
References 13 publications
“…Nakamura et al proposed a method to learn object concepts and word meanings from multimodal information and verbal information [10]. The method proposed in [10] is a categorization method based on multimodal latent Dirichlet allocation (MLDA) that enables the acquisition of object concepts from multimodal information, such as visual, auditory, and haptic information [13]. Araki et al addressed the development of a method combining unsupervised word segmentation from uttered sentences by a nested Pitman-Yor language model (NPYLM) [14] and the learning of object concepts by MLDA [11].…”
Section: A. Lexical Acquisition
confidence: 99%
“…4, the BoMHDP model is an extension of the MHDP model [5]. In this figure, x s jn , x c jn , x a jn , and x h jn denote the n-th feature of the j-th object's SIFT, color, audio, and haptic information, respectively.…”
Section: Multimodal Information
confidence: 99%
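The per-modality feature notation in the excerpt above (x s jn for SIFT, x c jn for color, x a jn for audio, x h jn for haptic features of the j-th object) can be pictured concretely as quantized codebook indices per modality. The following is a minimal sketch, not the paper's implementation; the object data, modality names, and codebook sizes are all hypothetical:

```python
from collections import Counter

# Hypothetical toy data: each object j holds, per modality, a list of
# quantized feature indices (codebook IDs), standing in for the
# x^s_jn, x^c_jn, x^a_jn, x^h_jn features in the excerpt above.
objects = [
    {"sift": [0, 2, 2], "color": [1], "audio": [3, 3], "haptic": [0]},
    {"sift": [1, 1], "color": [0, 0], "audio": [2], "haptic": [1, 1]},
]

# Assumed (illustrative) codebook sizes per modality.
codebook_sizes = {"sift": 3, "color": 2, "audio": 4, "haptic": 2}

def multimodal_histogram(obj, sizes):
    """Concatenate per-modality bag-of-features histograms into one vector.

    Each modality contributes a fixed-length count vector over its codebook;
    the concatenation is a simple multimodal representation of one object.
    """
    vec = []
    for modality in ("sift", "color", "audio", "haptic"):
        counts = Counter(obj[modality])
        vec.extend(counts.get(i, 0) for i in range(sizes[modality]))
    return vec

h = multimodal_histogram(objects[0], codebook_sizes)
# h = [1, 0, 2, 0, 1, 0, 0, 0, 2, 1, 0]: sift counts, then color,
# then audio, then haptic, one slot per codebook entry.
```

Models such as MHDP treat these per-modality counts as separate observation channels rather than one flat vector, but the flat histogram conveys the bag-of-features bookkeeping involved.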
“…We also proposed a multimodal categorization method [5] based on the hierarchical Dirichlet process (HDP) [6] to solve the problem of pLSA and LDA requiring the number of categories in advance. In these reports, we showed that multimodal information can be used to form categories that seem natural to humans.…”
Section: Introduction
confidence: 99%
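The excerpt notes that the HDP-based method avoids the requirement of pLSA and LDA that the number of categories be fixed in advance. The mechanism behind this can be illustrated with a Chinese restaurant process, the clustering prior underlying Dirichlet process models; this is a simplified stand-in for the full HDP, and the parameters here are purely illustrative:

```python
import random

def crp_assignments(n_customers, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process.

    The number of clusters is not fixed in advance: customer i sits at an
    existing table with probability proportional to that table's current
    occupancy, or opens a new table with probability proportional to alpha.
    """
    rng = random.Random(seed)
    counts = []          # customers currently seated at each table
    assignments = []
    for _ in range(n_customers):
        weights = counts + [alpha]   # last slot = open a new table
        r = rng.random() * sum(weights)
        acc = 0.0
        for table, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if table == len(counts):
            counts.append(1)         # a brand-new category appears
        else:
            counts[table] += 1
        assignments.append(table)
    return assignments, len(counts)

assignments, n_categories = crp_assignments(100, alpha=1.0)
# n_categories grows with the data (roughly as alpha * log n) instead of
# being specified up front, which is the property the excerpt highlights.
```

In the full HDP model, each table would additionally be associated with category-specific emission distributions over the multimodal features, shared across objects.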
“…Comprehending the surrounding environment through processing and analyzing of sensory input is one of the most essential functions for robots and intelligent systems [1], [2]. For example, automatic cleaning robots are equipped with haptic sensors or infrared sensors to avoid obstacles in a domestic room [3], and telepresence robots have cameras and microphones to deliver environmental awareness to their operators as remote sensing devices [4].…”
Section: Introduction
confidence: 99%
“…These two frameworks are different in their primal objectives: (1) a computational efficiency or tractability, and (2) an overall optimization of the system. Figure 1 illustrates two kinds of frameworks: A cascaded framework prioritizes the computational efficiency by sequentially extracting multiple pieces of information with different subsystems.…”
Section: Introduction
confidence: 99%