An effective coherence measure to determine topical consistency in user-generated content

He, Jiyin; Weerkamp, Wouter; Larson, Martha; Rijke, Maarten de

doi:10.1007/s10032-009-0089-5

Cited by 21 publications

(18 citation statements)

References 22 publications

“…Normalized TF video representation appears to be more robust to parameter setting than TF-IDF, since it shows consistent improvement for various values of parameter θ. In [8], the suggested parameter value is 95%, but here it seems that the indicator calculated on concept-based features may be even more robust than the one calculated using conventional (text-based) TF or TF-IDF document representations. Regarding the choice for x_cv, we investigate for which choice statistically significant improvements are obtained.…”

Section: Robustness To Parameter Setting (mentioning)

confidence: 96%

“…The approach is low in computational complexity and requires no labeled training data. Further, the coherence-based approach is appealing, because it goes beyond measuring the similarity of the top documents in a results list to measuring their topical clustering structure [8]. The coherence score is thus able to identify a results list as high-quality even in the face of relatively large diversity among the topical clusters in the top of results list.…”

Section: Query Performance Prediction (mentioning)

confidence: 99%

“…In our recent work [23], we demonstrated the performance of the coherence score defined in [8] and two light-weight alternatives for the task of text-based QES. Subsequently, we carried out initial work, reported briefly in [22,24], which established the potential of coherence score to be useful for multimodal QES.…”

Section: Query Performance Prediction (mentioning)

confidence: 99%

“…The coherence indicator [8] is used to select the results list with the highest coherence among the top-N retrieved results. The indicator is computed according to (9) as the ratio of video pairs in the top-N results whose similarity is larger than a threshold θ.…”

Section: Coherence Indicator (mentioning)

confidence: 99%
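
As a rough illustration of the coherence indicator described in the statement above, the following Python sketch computes the fraction of item pairs in a top-N list whose similarity exceeds a threshold θ. The cosine similarity over sparse term-weight dictionaries, the function names, and the fixed value of θ are assumptions made for illustration only; equation (9) of the citing paper and the way θ is derived in [8] are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coherence(top_n_vectors, theta):
    """Ratio of unordered pairs in the top-N list with similarity above theta.

    top_n_vectors: list of {term: weight} dicts, one per retrieved item
    theta: similarity threshold (assumed to be supplied by the caller)
    """
    pairs = list(combinations(top_n_vectors, 2))
    if not pairs:
        return 0.0  # fewer than two items: no pairs to compare
    similar = sum(1 for u, v in pairs if cosine(u, v) > theta)
    return similar / len(pairs)


# Hypothetical toy list: two items share a term, the third does not.
top_n = [{"a": 1.0}, {"a": 2.0}, {"b": 1.0}]
score = coherence(top_n, theta=0.5)
```

Under this sketch, a results-list selector would compute the score for each candidate list and keep the list with the highest value.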

Leveraging visual concepts and query performance prediction for semantic-theme-based video retrieval

Rudinac; Larson; Hanjalic. 2012. Int J Multimed Info Retr

Self Cite

In this paper, we present a novel approach that utilizes noisy shot-level visual concept detection to improve text-based video retrieval. As opposed to most of the related work in the field, we consider entire videos as the retrieval units and focus on queries that address a general subject matter (semantic theme) of a video. Retrieval is performed using a coherence-based query performance prediction framework. In this framework, we make use of video representations derived from the visual concepts detected in videos to select the best possible search result given the query, video collection, available search mechanisms and the resources for query modification. In addition to investigating the potential of this approach to outperform typical text-based video retrieval baselines, we also explore the possibility of achieving further improvement in retrieval performance by combining our concept-based query performance indicators with the indicators utilizing the spoken content of the videos. The proposed retrieval approach is data driven, requires no prior training and relies exclusively on the analyses of the video collection and different results lists returned for the given query text. The experiments are performed on the MediaEval 2010 datasets and demonstrate the effectiveness of our approach.

“…This representation is deployed by a QPP framework to evaluate the coherence (e.g. [29]) of the candidate video search list and to select the list which is most likely to respond to a given topical query. The potential power of this hybrid solution can be observed from the fact that the proposed approach is able to select the most suitable video search list for 30% more queries than in the cases where only textual information is used to compare the videos.…”

Section: Advanced Semantic Inference: Inferring the Aboutness Of The … (mentioning)

confidence: 99%

New grand challenge for multimedia information retrieval: bridging the utility gap

Hanjalic. 2012. Int J Multimed Info Retr

This is the author-created version of the article. The final publication is available at www.springerlink.com.

Abstract: The needs and expectations regarding multimedia content access have grown rapidly with the fast development of multimedia technology and the explosion of multimedia content around us. This imposed high demands on the level of sophistication of multimedia information retrieval (MIR) solutions. Although the potential to develop the MIR technology that meets such high demands has also rapidly grown over the years, we are not there yet with adequate solutions. This paper states that a significant step forward could become possible if the MIR field moves towards a utility-centered research focus. There, the criteria related to utility should be deployed to help us bridge the critical remaining gap that is in front of us: the utility gap, the gap between the expected and de facto usefulness of MIR systems. Utility criteria reach beyond the objective relevance of MIR results to also consider their informativeness and how helpful they are for the user's further actions. Bridging the utility gap can therefore be seen as the next grand challenge in the MIR research field. To pursue this challenge, we propose a utility-by-design approach, by which utility is targeted explicitly and embedded deep in the foundations of MIR solutions. The paper will first motivate this new MIR grand challenge and position it with respect to the current efforts in the field. Then, some possibilities for realizing the utility-by-design approach will be highlighted and translated into a number of recommended research directions.

Result diversification based on query‐specific cluster ranking

Meij; Rijke. 2011. J. Am. Soc. Inf. Sci.

Self Cite

Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification is restricted to documents belonging to clusters that potentially contain a high percentage of relevant documents. Empirical results show that the proposed framework improves the performance of several existing diversification methods. The framework also gives rise to a simple yet effective cluster-based approach to result diversification that selects documents from different clusters to be included in a ranked list in a round robin fashion. We describe a set of experiments aimed at thoroughly analyzing the behavior of the two main components of the proposed diversification framework, ranking and selecting clusters for diversification. Both components have a crucial impact on the overall performance of our framework, but ranking clusters plays a more important role than selecting clusters. We also examine properties that clusters should have in order for our diversification framework to be effective. Most relevant documents should be contained in a small number of high-quality clusters, while there should be no dominantly large clusters. Also, documents from these high-quality clusters should have a diverse content. These properties are strongly correlated with the overall performance of the proposed diversification framework.

Introduction

Queries submitted to Web search engines are often ambiguous or multi-faceted in the sense that they have multiple interpretations or sub-topics (Allan & Raghavan, 2002). For ambiguous queries, a typical example is the query "jaguar" that can refer to several interpretations including a kind of animal, a car brand, a type of cocktail, an operating system, etc. Multi-faceted queries are even more commonly seen in practice; for example, for the interpretation "jaguar car" of the query "jaguar", a wide range of sub-topics may be covered: models, prices, history of the company, etc. For such queries we often cannot be certain what the searcher's underlying information need is because of a lack of context. One retrieval strategy that attempts to cater for multiple interpretations of an ambiguous or multi-faceted query is to diversify the search results (Boyce, 1982; Goffman, 1964). Without explicit or implicit user feedback or history, the retrieval system makes an educated guess as to the possible facets of the query and presents as diverse a result list as possible by including documents pertaining to different facets of the query within the top-ranked documents.

Recently, various result diversification methods have been proposed (Agrawal, Gollapudi, Halverson, & Ieong, 2009; Carbonell & Goldstein, 1998; Carterette & Chandar, 2009; Chen & Karger, 2006; Radlinski, Kleinberg, & Joachims, 2008; Santos, Macdonald, & Ounis, 2010; Zhai, Cohen, & Lafferty, 2003). Traditional retrieval strategies such ...
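
The round-robin selection mentioned in the abstract above can be sketched in Python as follows. This is a minimal illustration, assuming the clusters have already been ranked and that each cluster holds its documents in ranked order; the function name, the input layout, and the duplicate handling are hypothetical and are not taken from the paper.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for clusters that have run out of documents


def round_robin(ranked_clusters, k):
    """Interleave documents from ranked clusters into a list of at most k items.

    ranked_clusters: list of clusters, best cluster first; each cluster is a
                     list of document ids, best document first (assumed input)
    k: maximum length of the diversified result list
    """
    result = []
    if k <= 0:
        return result
    seen = set()
    # Each "tier" holds the i-th document of every cluster, in cluster order.
    for tier in zip_longest(*ranked_clusters, fillvalue=_MISSING):
        for doc in tier:
            if doc is _MISSING or doc in seen:
                continue
            seen.add(doc)
            result.append(doc)
            if len(result) >= k:
                return result
    return result


# Hypothetical example with three clusters of different sizes.
clusters = [["d1", "d2", "d3"], ["d4"], ["d5", "d6"]]
diversified = round_robin(clusters, k=4)
```

The sketch takes one document from each cluster in turn, so the top of the list covers every selected cluster before any cluster contributes a second document.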
