2020
DOI: 10.47476/jat.v3i2.2020.138
Taking a Cue From the Human

Abstract: Human beings find the process of narrative sequencing in written texts and moving imagery a relatively simple task. Key to the success of this activity is establishing coherence by using critical cues to identify key characters, objects, actions and locations as they contribute to plot development. In the drive to make audiovisual media more widely accessible (through audio description), and media archives more searchable (through content description), computer vision experts strive to automate video cap…

Cited by 5 publications (8 citation statements)
References 32 publications
“…Results of the present study could, for instance, make audio describers aware of the importance of verbalising event changes in relation to the spatiotemporal context and offer examples and solutions as to how this can be done. Event segmentation plays also an important role for automated, computer-generated video description (Braun et al, 2020;Starr et al, 2020). It is necessary to teach the algorithms to identify actions in dynamic scenes, to "see" connections between frames and actions, and to recognise event boundaries.…”
Section: Discussion
confidence: 99%
“…Event segmentation is also important for computer-generated video description, a new promising area that is being rapidly developed to supplement human audio description and make audio-visual media more widely accessible. However, as the comparisons of human and automatic scene descriptions show, there are fundamental problems with the current state and quality of machine-generated descriptions (Braun et al, 2020;Starr et al, 2020). Among other issues, computer algorithms "see" images in isolation as single frames, do not integrate them, and are likely to miss several key actions (or mis-label them).…”
Section: Introduction
confidence: 99%
“…The process of event segmentation is equally essential for automated computer-generated video description (Braun, Starr, & Laaksonen, 2020;Starr, Braun, & Delfani, 2020;Starr & Braun, 2023). […] content on an equal footing with their sighted counterparts.…”
Section: Event Boundary Perception in AD Films
confidence: 99%
“…Consequently, a central inquiry in the context of successful AD pertains to how the sequence of events unfolding in a film aligns with the verbal narration conveyed through AD. Event segmentation also holds significance in the domain of computer-generated video description, an emerging field aimed at augmenting human-generated AD to make audio-visual media more widely accessible (Braun, Starr & Laaksonen, 2020;Starr, Braun & Delfani, 2020).…”
confidence: 99%
“…In the European MeMAD project (grant no. 780069), our primary focus has been on developing semi-automated video description models which replicate, as far as possible, the work of human describers of audiovisual content [1,2]. This has been achieved using computer vision modelling, theories of human engagement with multimodal narrative, and the integration of machine-generated data within an editing platform, Flow, which draws together machine descriptions, named-entity recognition, metadata, transcriptions and translation services (Fig.…”
Section: Study Aims and Structure
confidence: 99%