Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval 2008
DOI: 10.1145/1460096.1460100

Combining image descriptors to effectively retrieve events from visual lifelogs

Abstract: The SenseCam is a wearable camera that passively captures approximately 3,000 images per day, which equates to almost one million images per year. It is used to create a personal visual recording of the wearer's life and generates information which can be helpful as a human memory aid. For such a large amount of visual information to be of any use, it is accepted that it should be structured into "events", of which there are about 8,000 in a wearer's average year. In automatically segmenting SenseCam images in…

Cited by 40 publications (31 citation statements)
References 22 publications
“…Our answer is that while we observed evidence of event boundary decay (only 63% overlap), we do not believe that this poses as large a challenge for maintaining a long-term lifelog as we had expected, e.g. for determining event boundaries automatically [9], determining the interestingness of events [10], or indeed in retrieving events from a lifelog [8] (Section 5.3)…”
Section: Discussion (mentioning)
confidence: 78%
“…A 'merging' fusion combined with a support vector machine (SVM) classifier, a back-propagation fusion with a KNN classifier and a Fuzzy-ART neurofuzzy network strategy are explored, which can be extended in matching the segments of an image with predefined object models. The fusion (baseline fusion and score fusion) of MPEG-7, SIFT and SURF is also explored and evaluated in [32] to address content-based event search. The detailed results conclude that the MPEG-7, SIFT and SURF are broadly comparable and also highly complementary.…”
Section: Related Work (mentioning)
confidence: 99%
“…Doherty et al [4] augmented streams of Lifelog images with geographic data by including locational information provided by a GPS unit rather than estimated from the visual content. Torre et al [19] recorded the actions and displacements of several subjects using fixed cameras, motion capture, inertial sensors and head-worn narrow angle cameras, that can provide precise data on the movements and ongoing instrumental activities in a restricted place.…”
Section: Related Work (mentioning)
confidence: 99%