2014
DOI: 10.1007/s11042-014-2044-9

Content-oriented multimedia document understanding through cross-media correlation

Cited by 11 publications (6 citation statements)
References 39 publications
“…A more systematical and detailed introduction to the discussed techniques may be found in the references for image processing and machine vision in [48][49][50][51]. …”
Section: Discussion (mentioning)
confidence: 99%
“…Essentially, multimodal video data originated from the same source tend to be correlated [1,2,48]. It means that different modalities can take a complementary role on solving video content analysis tasks, and the presence of one modality can help understand certain semantics of others.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, the latent semantic analysis derived from language processing is proposed as an interesting solution to learn overlapped audio events [47]. Lu et al [22] propose a multimodal correlation network, in which audio-to-audio retrievals can be improved by incorporating visual image information. However, this area requires more studying to apply the techniques efficiently into auditory scene understanding.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this paper, we propose a novel audio event recognition framework for acoustic scene understanding based on our previous work on sound classification [3,19], audio summarization [20,21] and audio-visual correlation [4,22]. The term auditory scene here refers to the acoustic modeling of a specific location or site such as home, bus station, restaurant and shopping mall, which is similar to what an image of the same location provides visually.…”
mentioning
confidence: 99%