2013
DOI: 10.1117/12.2008425
Multimedia event detection using visual concept signatures

Abstract: Multimedia Event Detection (MED) is a multimedia retrieval task with the goal of finding videos of a particular event in a large-scale Internet video archive, given example videos and text descriptions. In this paper, we mainly focus on an 'ad-hoc' scenario in MED where we do not use any example video. We aim to retrieve test videos based on their visual semantics using a Visual Concept Signature (VCS) generated for each event, derived only from the event description provided as the query. Visual semantics are …
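The abstract gives only a high-level picture of the retrieval step. As a rough sketch of the underlying idea, the snippet below ranks candidate videos by cosine similarity between an event's concept-weight vector (a stand-in for the VCS) and each video's per-concept likelihood scores. The concept vocabulary, the VCS weights, and the video scores are invented for illustration; the paper's actual signature construction and scoring are not described in this excerpt.

```python
# Hypothetical sketch: rank videos for an event by comparing the event's
# Visual Concept Signature (VCS) against each video's per-concept scores.
# Concept names, weights, and scores below are invented for illustration.
import numpy as np

concepts = ["dog", "kitchen", "bread", "knife", "crowd"]  # predefined concept vocabulary

# VCS for the event "making a sandwich": one weight per concept, derived (in the
# paper) from the textual event description; here the weights are made up.
vcs = np.array([0.0, 0.8, 0.9, 0.7, 0.1])

# Per-video concept likelihoods (one row per video), e.g. detector outputs.
video_scores = np.array([
    [0.1, 0.7, 0.8, 0.6, 0.2],   # video A: kitchen/bread/knife likely present
    [0.9, 0.1, 0.0, 0.1, 0.8],   # video B: dog and crowd, unrelated event
])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

ranking = sorted(range(len(video_scores)),
                 key=lambda i: cosine(vcs, video_scores[i]),
                 reverse=True)
print(ranking)  # video A (index 0) ranks above video B for this event
```

Higher-similarity videos are returned first, which is the core of the example-free ('ad-hoc') retrieval setting described above.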

Cited by 4 publications (4 citation statements)
References 4 publications
“…In addition, it is also important to legitimately demonstrate the performance in multimedia retrieval by comparing with other related works, which also explore multi-modalities. For example, many researchers explored the cross-modal relationship by applying canonical correlation analysis to better perform in multimedia retrieval (Rasiwasia et al, 2010;Zhang & Liu, 2012), face recognition (Guam, Zhang, Luo, & Lan, 2012), event detection (Younessian, Quinn, Mitamura, & Hauptmann, 2013), etc. In addition, besides positive correlation, Zhai et al also pointed out the importance of capturing negative cross-modality correlation since it can provide exclusive information.…”
Section: Discussion (mentioning)
confidence: 99%
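The canonical correlation analysis mentioned in this citation can be made concrete with a small sketch. The code below uses scikit-learn's CCA on synthetic paired visual/text features to learn a shared space and then retrieves items by cosine similarity in that space; it illustrates the general CCA-based cross-modal retrieval idea, not the specific implementations of the works cited above.

```python
# Illustrative only: cross-modal retrieval with canonical correlation analysis.
# The features and dimensions are synthetic; real systems would use learned
# visual and text representations.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_pairs, d_visual, d_text, d_shared = 200, 50, 30, 10

# Paired training data: visual and text features describing the same items.
X_visual = rng.normal(size=(n_pairs, d_visual))
X_text = 0.5 * X_visual[:, :d_text] + 0.1 * rng.normal(size=(n_pairs, d_text))

cca = CCA(n_components=d_shared)
cca.fit(X_visual, X_text)

# Project both modalities into the shared space.
visual_c, text_c = cca.transform(X_visual, X_text)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# Retrieve: for the text description of item 0, rank all videos by cosine
# similarity in the shared space; the paired video should rank near the top.
query = text_c[0]
ranking = np.argsort([-cosine(query, v) for v in visual_c])
print(ranking[:5])
```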
“…In particular, in video retrieval tasks that use only a text-based query, we can retrieve visual concepts in a video using a visual concept signature generated from the given text query. For instance, in [123] the visual concept signature idea, explained in Section 6.2.1, was used to tackle the ad-hoc Multimedia Event Detection task. Multimedia Event Detection (MED) is a multimedia retrieval task with the goal of finding videos of a particular event (e.g.…”
Section: Early Fusion Evaluation (mentioning)
confidence: 99%
“…"getting a vehicle unstuck", "wedding", "making a sandwich", etc) in a large-scale internet video archive, given text descriptions of events. In [123], the test videos were retrieved based on their visual semantics using a Visual Concept Signature (VCS) generated for each event only derived from the event description provided as the query. Visual semantics are described using the Semantic Indexing (SIN) feature which represents the likelihood of predefined visual concepts in a video, similar…”
Section: Early Fusion Evaluation (mentioning)
confidence: 99%
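To make "a signature derived only from the event description" concrete, here is a deliberately simplified sketch that weights a fixed concept vocabulary by term overlap with the description text. The vocabulary, the description wording, and the weighting scheme are illustrative assumptions, not the VCS generation procedure of [123].

```python
# Toy illustration (not the paper's actual method) of deriving a concept
# signature from an event's text description: weight each predefined concept
# by how often its name appears in the description.
import re
from collections import Counter

concept_vocabulary = ["vehicle", "road", "mud", "rope", "person", "cake"]

event_description = (
    "Getting a vehicle unstuck: a vehicle is stuck in mud or snow on a road "
    "and one or more people push it or pull it free with a rope."
)

tokens = Counter(re.findall(r"[a-z]+", event_description.lower()))
total = sum(tokens.values())

# VCS: normalized relevance weight per concept in the fixed vocabulary.
vcs = {c: tokens.get(c, 0) / total for c in concept_vocabulary}
print(vcs)  # non-zero weight for "vehicle", "road", "mud", "rope"; zero for "cake"
```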