This paper provides an overview of a pilot evaluation of video summaries using rushes from several BBC dramatic series, carried out under the auspices of TRECVID. Twenty-two research teams submitted video summaries, each at most 4% of the duration of its source, for 42 individual rushes video files, with the aim of compressing out redundant and insignificant material. The output of two baseline systems built on straightforward content-reduction techniques was contributed by Carnegie Mellon University as a control. Procedures for developing ground truth lists of important segments from each video were developed at Dublin City University and applied to the BBC video. At NIST, each summary was judged by three humans with respect to how much of the ground truth it included, how easy it was to understand, and how much repeated material it contained. Additional objective measures included how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and the summary's duration. Assessor agreement on finding desired segments averaged 78%, and the results indicate that, while it is difficult to exceed the performance of the baselines, a few systems did.
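As a rough illustration of the evaluation just described, the sketch below computes two of its ingredients: the fraction of ground-truth segments an assessor found in a summary, and a simple pairwise agreement figure across assessors. The segment identifiers, data layout, and toy judgements are hypothetical assumptions; this is not the NIST scoring code.

```python
# Illustrative sketch only: inclusion of ground-truth segments and pairwise
# assessor agreement for a single rushes summary. All data below are made up.

from itertools import combinations

def inclusion_fraction(found_segments, ground_truth_segments):
    """Share of ground-truth segments an assessor marked as present in the summary."""
    if not ground_truth_segments:
        return 0.0
    return len(set(found_segments) & set(ground_truth_segments)) / len(ground_truth_segments)

def pairwise_agreement(judgements):
    """Mean per-segment agreement over every pair of assessors.

    `judgements` maps assessor id -> {segment id: True/False (found or not)}.
    """
    pairs = list(combinations(judgements.values(), 2))
    if not pairs:
        return 1.0
    scores = []
    for a, b in pairs:
        common = set(a) & set(b)
        scores.append(sum(a[s] == b[s] for s in common) / len(common))
    return sum(scores) / len(scores)

# Hypothetical usage: one video with four ground-truth segments, three assessors
ground_truth = ["seg01", "seg02", "seg03", "seg04"]
judgements = {
    "assessor1": {"seg01": True, "seg02": True, "seg03": False, "seg04": True},
    "assessor2": {"seg01": True, "seg02": False, "seg03": False, "seg04": True},
    "assessor3": {"seg01": True, "seg02": True, "seg03": True, "seg04": True},
}
found_by_1 = [s for s, v in judgements["assessor1"].items() if v]
print(inclusion_fraction(found_by_1, ground_truth))  # 0.75 for this toy data
print(pairwise_agreement(judgements))                # about 0.67 for this toy data
```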
Summary. Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically, however, this depends on being able to determine whether each feature is or is not present in a video clip. The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year on year. This provides a fairly comprehensive statement on where the state of the art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and on-going work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video. Published in A. Divakaran (ed.), Multimedia Content Analysis, Signals and Communication Technology.
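To make the notion of a high-level feature detector concrete, here is a minimal sketch, assuming one binary classifier per feature applied to a colour-histogram descriptor of a shot's keyframe. The feature names, the descriptor, and the classifier choice are illustrative assumptions, not a description of any TRECVid participant's system.

```python
# Hypothetical per-feature detector: colour-histogram keyframe descriptor
# plus a logistic-regression classifier. Purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

def colour_histogram(keyframe, bins=8):
    """Normalised per-channel histogram of an RGB keyframe (H x W x 3 array)."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(keyframe[..., c], bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    return np.concatenate(hists)

class FeatureDetector:
    """One detector per high-level feature, e.g. 'outdoor', 'face', 'sports'."""

    def __init__(self, feature_name):
        self.feature_name = feature_name
        self.model = LogisticRegression(max_iter=1000)

    def fit(self, keyframes, labels):
        X = np.stack([colour_histogram(k) for k in keyframes])
        self.model.fit(X, labels)
        return self

    def score(self, keyframe):
        """Probability that the feature is present in the shot's keyframe."""
        x = colour_histogram(keyframe).reshape(1, -1)
        return float(self.model.predict_proba(x)[0, 1])

# Hypothetical usage with random stand-in keyframes and alternating labels
rng = np.random.default_rng(0)
train_frames = [rng.integers(0, 256, (120, 160, 3)) for _ in range(20)]
train_labels = np.array([i % 2 for i in range(20)])
detector = FeatureDetector("outdoor").fit(train_frames, train_labels)
print(detector.score(rng.integers(0, 256, (120, 160, 3))))
```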
Background: Over the past three decades the global prevalence of childhood overweight and obesity has increased by 47%. Marketing of energy-dense, nutrient-poor foods and beverages contributes to this worldwide increase. Previous research on food marketing to children largely uses self-report, reporting by parents, or third-party observation of children’s environments, with the focus mostly on single settings and/or media. This paper reports on innovative research, Kids’Cam, in which children wore cameras to examine the frequency and nature of everyday exposure to food marketing across multiple media and settings.

Methods: Kids’Cam was a cross-sectional study of 168 children (mean age 12.6 years, SD = 0.5) in Wellington, New Zealand. Each child wore a wearable camera on four consecutive days, capturing images automatically every seven seconds. Images were manually coded as either recommended (core) or not recommended (non-core) to be marketed to children, by setting, marketing medium, and product category. Images in convenience stores and supermarkets were excluded, as marketing examples there were considered too numerous to count.

Results: On average, children were exposed to non-core food marketing 27.3 times a day (95% CI 24.8, 30.1) across all settings. This was more than twice their average exposure to core food marketing (12.3 per day, 95% CI 8.7, 17.4). Most non-core exposures occurred at home (33%), in public spaces (30%) and at school (19%). Food packaging was the predominant marketing medium (74% and 64% for core and non-core foods respectively), followed by signs (21% and 28% for core and non-core). Sugary drinks, fast food, confectionery and snack foods were the most commonly encountered non-core foods marketed. Rates were calculated using Poisson regression.

Conclusions: Children in this study were frequently exposed, across multiple settings, to marketing of non-core foods not recommended to be marketed to children. The study provides further evidence of the need for urgent action to reduce children’s exposure to marketing of unhealthy foods, and suggests the settings and media in which to act. Such action is necessary if the Commission on Ending Childhood Obesity’s vision is to be achieved.

Electronic supplementary material: The online version of this article (doi:10.1186/s12966-017-0570-3) contains supplementary material, which is available to authorized users.
This methodology enabled objective analysis of the world in which children live. The main arm examined the frequency and nature of children's exposure to food and beverage marketing and provided data on difficult-to-measure settings. The methodology will likely generate robust evidence facilitating more effective policymaking to address numerous public health concerns.
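As a hedged sketch of the kind of rate calculation mentioned above ("Rates were calculated using Poisson regression"), the snippet below fits an intercept-only Poisson GLM with an offset for days of camera wear, so that the exponentiated intercept is an exposures-per-child-per-day rate. The tiny dataset and the use of statsmodels are assumptions for illustration, not the study's actual analysis code.

```python
# Illustrative only: daily exposure rate from per-child counts via a Poisson
# GLM with a log(days observed) offset. The numbers below are invented.

import numpy as np
import statsmodels.api as sm

# Hypothetical per-child data: total non-core exposures and days of camera wear
exposures = np.array([102, 95, 130, 88, 115, 99])
days_observed = np.array([4, 4, 4, 3, 4, 4])

# Intercept-only Poisson model; `exposure` supplies the log(days) offset,
# so exp(intercept) is the estimated number of exposures per child per day.
model = sm.GLM(
    exposures,
    np.ones_like(exposures, dtype=float),  # intercept-only design matrix
    family=sm.families.Poisson(),
    exposure=days_observed,
)
result = model.fit()
rate = np.exp(result.params[0])
ci_low, ci_high = np.exp(result.conf_int()[0])
print(f"estimated rate: {rate:.1f} per day (95% CI {ci_low:.1f}, {ci_high:.1f})")
```

Real analyses of clustered repeated-measures data like this would also need to account for within-child correlation, which this toy intercept-only model ignores.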
Many research groups worldwide are now investigating techniques which can support information retrieval on archives of digital video, and as groups move on to implement these techniques they inevitably try to evaluate how well those techniques perform in practical situations. The difficulty with doing this is that there is no test collection or environment in which the effectiveness of video IR, or of video IR sub-tasks, can be evaluated and compared. The annual series of TREC exercises has, for over a decade, been benchmarking the effectiveness of systems in carrying out various information retrieval tasks on text and audio, and has contributed to a huge improvement in many of these. Two years ago a track was introduced which covers shot boundary detection, feature extraction and searching through archives of digital video. In this paper we present a summary of the activities in the TREC Video track in 2002, in which 17 teams from across the world took part.
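As a concrete, if simplistic, illustration of one of the track's tasks, the sketch below flags abrupt shot boundaries (cuts) by thresholding colour-histogram differences between consecutive frames. The threshold value and the synthetic frames are assumptions made for illustration; systems evaluated in the track use far more sophisticated detectors.

```python
# Illustrative cut detector: large histogram change between consecutive
# frames is taken as a shot boundary. Threshold and data are assumptions.

import numpy as np

def frame_histogram(frame, bins=16):
    """Per-channel colour histogram of an RGB frame (H x W x 3), normalised to sum to 1."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(frame[..., c], bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    return np.concatenate(hists)

def detect_cuts(frames, threshold=0.5):
    """Return frame indices where an abrupt shot change (cut) is suspected."""
    hists = [frame_histogram(f) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        # L1 distance between consecutive frame histograms
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            cuts.append(i)
    return cuts

# Hypothetical usage: a dark "shot" followed by a bright one
rng = np.random.default_rng(1)
dark = [rng.integers(0, 60, (90, 120, 3)) for _ in range(10)]
bright = [rng.integers(180, 256, (90, 120, 3)) for _ in range(10)]
print(detect_cuts(dark + bright))  # expected to flag the boundary at index 10
```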