Proceedings of the British Machine Vision Conference 2009
DOI: 10.5244/c.23.121

Subtitle-free Movie to Script Alignment

Abstract: A standard solution for aligning scripts to movies is to use dynamic time warping with the subtitles (Everingham et al., BMVC 2006). We investigate the problem of aligning scripts to TV video/movies in cases where subtitles are not available, e.g. in the case of silent films or for film passages which are non-verbal. To this end we identify a number of "modes of alignment" and train classifiers for each of these. The modes include visual features, such as locations and face recognition, and audio features such…
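
The subtitle-based baseline mentioned in the abstract relies on dynamic time warping (DTW). As a rough illustration only, the sketch below aligns a list of script lines to a list of subtitle lines with a textbook DTW. It is not the implementation from either paper: the function names `word_overlap_cost` and `dtw_align` and the word-overlap cost are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def word_overlap_cost(script_line: str, subtitle_line: str) -> float:
    """Hypothetical cost: 1 minus the Jaccard overlap of the lowercase word sets."""
    a = set(script_line.lower().split())
    b = set(subtitle_line.lower().split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def dtw_align(script: Sequence[str], subtitles: Sequence[str]) -> List[Tuple[int, int]]:
    """Return a monotonic alignment path of (script_index, subtitle_index) pairs."""
    n, m = len(script), len(subtitles)
    inf = float("inf")
    # acc[i][j] = minimal cumulative cost of aligning script[:i] with subtitles[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = word_overlap_cost(script[i - 1], subtitles[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the final cell, always stepping to the cheapest predecessor.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (acc[i - 1][j], i - 1, j),          # advance the script only
            (acc[i][j - 1], i, j - 1),          # advance the subtitles only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


if __name__ == "__main__":
    # Made-up example data, not taken from the paper.
    script = ["hello there", "where are you going", "to the station"]
    subtitles = ["hello there", "where are you", "going", "to the station"]
    print(dtw_align(script, subtitles))
```

Each subtitle carries a timestamp in practice, so a path like this would let script lines inherit timing from the subtitles they are matched to. Without subtitles there is no such text-to-text cost, which is the gap the paper's "modes of alignment" are meant to address.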

Cited by 31 publications (14 citation statements)
References 12 publications
“…There are some scenarios where only transcripts are used as the form of supervision. [22] proposes a method to align scripts to videos in the absence of subtitles. [21] jointly model co-reference resolution and character identification using only scripts.…”
Section: Related Work (mentioning)
confidence: 99%
“…When subtitle information is either not available or there are scenes without dialogue, other methods need to be used to provide alignment information. Sankar et al [28] use a combination of location recognition, facial recognition and speech-to-text to align scripts when subtitles are not available. However, the approach requires manual labelling of characters and location information was hard-coded to repetitive use of stock footage.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the absence of subtitles, Sankar et al [9] attempted to align script and movie using video cues, on-screen text (for silent movies) and speech-to-text transcript. When a word-level time accuracy is needed, a script-speech alignment using a speech recognition engine associated with a script-based language model such as described by Hong et al in [10] is best.…”
Section: Introduction (mentioning)
confidence: 99%