With the huge and ever-growing amount of video content available on the Web, there is a need for effective retrieval functionalities over very large video collections. Most current Web video retrieval systems rely on manual textual annotations to provide keyword-based search interfaces. These systems face the problems that users are often reluctant to provide annotations, and that the quality of such annotations is questionable in many cases. A common alternative is to ask the user for an example image and exploit its low-level features to find video content whose keyframes are visually similar. The main limitation here is the so-called semantic gap: low-level image features often fail to capture the actual semantics of the videos. Moreover, this approach places a burden on the user, who has to find and supply relevant visual examples. To address these limitations, in this paper we present a hybrid video retrieval technique that automatically obtains visual examples by performing textual searches on external knowledge sources, such as DBpedia, Flickr and Google Images, which have different coverage and structure characteristics. Our approach exploits the semantics underlying these knowledge sources to address the semantic gap problem. We have conducted evaluations to assess the quality of the visual examples retrieved from these external knowledge sources.
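
As a rough illustration of the pipeline described above, the following sketch turns a textual query into automatically retrieved example images and then ranks video keyframes by low-level visual similarity. It is a minimal sketch only: the helper names (fetch_example_images, extract_features, rank_keyframes) and the colour-histogram descriptor are assumptions for illustration, not the system's actual implementation or APIs.

```python
# Minimal sketch of the text-to-visual-example retrieval idea.
# All helper names below are hypothetical placeholders.
from typing import List, Tuple
import numpy as np

def fetch_example_images(query: str, source: str = "flickr") -> List[np.ndarray]:
    """Hypothetical: run a textual search against an external knowledge
    source (e.g., Flickr, Google Images, DBpedia-linked media) and return
    decoded example images for the query."""
    raise NotImplementedError

def extract_features(image: np.ndarray) -> np.ndarray:
    """Assumed low-level descriptor: an 8x8x8 RGB colour histogram,
    L1-normalised. The actual system may use different features."""
    hist, _ = np.histogramdd(image.reshape(-1, 3), bins=(8, 8, 8),
                             range=((0, 256),) * 3)
    vec = hist.ravel().astype(np.float64)
    total = vec.sum()
    return vec / total if total > 0 else vec

def rank_keyframes(query: str,
                   keyframes: List[Tuple[str, np.ndarray]]) -> List[Tuple[str, float]]:
    """Rank video keyframes by their best cosine similarity to the
    automatically retrieved visual examples for the textual query."""
    example_feats = [extract_features(img) for img in fetch_example_images(query)]
    ranked = []
    for video_id, frame in keyframes:
        f = extract_features(frame)
        sim = max(float(np.dot(f, e) /
                        (np.linalg.norm(f) * np.linalg.norm(e) + 1e-12))
                  for e in example_feats)
        ranked.append((video_id, sim))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

In this sketch the textual search replaces the user-supplied example image, so the user only issues a keyword query while the visual matching still operates on low-level features.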