Proceedings of the 13th Annual ACM International Conference on Multimedia 2005
DOI: 10.1145/1101149.1101290
SEVA: Sensor-Enhanced Video Annotation

Abstract: In this paper, we study how a sensor-rich world can be exploited by digital recording devices such as cameras and camcorders to improve a user's ability to search through a large repository of image and video files. We design and implement a digital recording system that records identities and locations of objects (as advertised by their sensors) along with visual images (as recorded by a camera). The process, which we refer to as sensor-enhanced video annotation (SEVA), combines a series of correlation, inter…
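The abstract is truncated above, but its core idea — correlating sensor-advertised object identities and locations with captured frames by time — can be sketched. The following is a minimal illustration, not the paper's actual pipeline: the Beacon and Frame types, the max_skew window, and annotate_frames are all hypothetical names, and SEVA's real system additionally interpolates and extrapolates object positions between beacon receptions.

```python
from bisect import bisect_left
from dataclasses import dataclass

@dataclass
class Beacon:
    object_id: str       # identity advertised by the object's sensor
    location: tuple      # (x, y) position reported by the sensor
    timestamp: float     # seconds since recording start

@dataclass
class Frame:
    index: int
    timestamp: float

def annotate_frames(frames, beacons, max_skew=0.5):
    """Attach to each frame the objects whose sensor beacons were
    heard within max_skew seconds of the frame's capture time."""
    beacons = sorted(beacons, key=lambda b: b.timestamp)
    times = [b.timestamp for b in beacons]
    annotations = {}
    for frame in frames:
        # Select beacons in the window [t - max_skew, t + max_skew].
        lo = bisect_left(times, frame.timestamp - max_skew)
        hi = bisect_left(times, frame.timestamp + max_skew)
        # Keep only the most recent beacon per object.
        latest = {}
        for b in beacons[lo:hi]:
            latest[b.object_id] = b
        annotations[frame.index] = sorted(
            (b.object_id, b.location) for b in latest.values()
        )
    return annotations

frames = [Frame(0, 0.0), Frame(1, 0.04)]
beacons = [Beacon("projector-17", (3.2, 1.1), 0.02)]
print(annotate_frames(frames, beacons))
# {0: [('projector-17', (3.2, 1.1))], 1: [('projector-17', (3.2, 1.1))]}
```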

Cited by 30 publications (2 citation statements)
References 36 publications
“…The video annotation technique presented in [14] exploits the redundancy among YouTube videos to find connections between videos and propagate tags among similar videos. The techniques presented in [13] and [7] use contextual metadata acquired from the sensors on smartphones to generate video tags. In [9], the proposed technique recognizes basic objects in the images and videos of a digital camera and extracts metadata, including geographical and date/time information, to generate tags.…”
Section: Related Work (mentioning)
confidence: 99%
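The tag-propagation idea attributed to [14] in the statement above can be made concrete with a small sketch. This is one plausible reading, not the cited system: the similarity graph, the min_votes threshold, and propagate_tags are illustrative names, and the real technique derives video similarity from redundancy among YouTube videos.

```python
def propagate_tags(tags, similar, min_votes=2):
    """One round of tag propagation: a video adopts a tag if at least
    min_votes of its similar videos already carry that tag.
    tags:    {video_id: set of tags}
    similar: {video_id: list of similar video ids}"""
    updated = {v: set(t) for v, t in tags.items()}
    for video, neighbors in similar.items():
        votes = {}
        for n in neighbors:
            for tag in tags.get(n, ()):
                votes[tag] = votes.get(tag, 0) + 1
        # Adopt tags that enough similar videos agree on.
        updated[video] |= {t for t, c in votes.items() if c >= min_votes}
    return updated

tags = {"a": {"cat"}, "b": {"cat", "piano"}, "c": set()}
similar = {"c": ["a", "b"]}
print(propagate_tags(tags, similar))
# "c" gains "cat" (2 votes) but not "piano" (1 vote)
```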
“…This is not consistent with reality: a semantic concept can have very different visual representations (for example, the side view of a sedan and the front view of a sedan can look very different). Trying to bridge this gap, some mobile information management systems use tags generated at the point the picture was taken: in Davis et al. (2004), Liu et al. (2005), and Lahti et al. (2005), location, temporal, and sometimes social contextual metadata are used as image features instead of visual features. However, a picture is worth a thousand words.…”
Section: Introduction (mentioning)
confidence: 99%
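The contextual-metadata approach mentioned in that statement can likewise be sketched. The following is a toy illustration under stated assumptions, not any of the cited systems: context_features, its particular encoding (a cyclic hour-of-day term plus a companion count), and the nearest-neighbor lookup are all hypothetical.

```python
import math

def context_features(lat, lon, hour_of_day, companions):
    """Hypothetical contextual feature vector: location, a cyclic
    encoding of capture time, and a coarse social-context count.
    It only illustrates indexing photos by capture context
    rather than by pixel content."""
    theta = 2 * math.pi * hour_of_day / 24.0
    return [lat, lon, math.sin(theta), math.cos(theta), float(len(companions))]

def nearest(query, catalog):
    """Return the catalog photo whose capture context is closest
    (squared Euclidean distance) to the query context."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(catalog, key=lambda item: dist(item[1], query))

catalog = [
    ("beach.jpg", context_features(41.39, 2.17, 15, ["anna"])),
    ("office.jpg", context_features(41.40, 2.15, 9, [])),
]
query = context_features(41.39, 2.16, 14, ["anna"])
print(nearest(query, catalog)[0])  # beach.jpg
```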