Formalising stories: sequences of events and state changes

Vassiliou, Andrew; Salway, Andrew; Pitt, David

doi:10.1109/icme.2004.1394260

Cited by 4 publications

(3 citation statements)

References 11 publications


“…For example, dialogue detection experiments have been performed using low-level audio and visual features with a maximum classification accuracy of 96% [1]. Alternatively, emotional stages are proposed as a means for segmenting video in [24]. Detecting monologues based on audio-visual information is discussed in [11], where a maximum recall of 0.880 is reported.…”

Section: Introduction (mentioning)

confidence: 99%

A neural network approach to audio-assisted movie dialogue detection

et al. 2007


A novel framework for audio-assisted dialogue detection based on indicator functions and neural networks is investigated. An indicator function defines that an actor is present at a particular time instant. The cross-correlation function of a pair of indicator functions and the magnitude of the corresponding cross-power spectral density are fed as input to neural networks for dialogue detection. Several types of artificial neural networks, including multilayer perceptrons, voted perceptrons, radial basis function networks, support vector machines, and particle swarm optimization-based multilayer perceptrons are tested. Experiments are carried out to validate the feasibility of the aforementioned approach by using ground-truth indicator functions determined by human observers on 6 different movies. A total of 41 dialogue instances and another 20 non-dialogue instances is employed. The average detection accuracy achieved is high, ranging between 84.78%±5.499% and 91.43%±4.239%.


“…The LSU segmentation is based on the investigation of visual information and its temporal variations in a video sequence. A movie can be modeled as a sequence of states and events organized in space and time by creating a state graph representing the film story [47]. As far as dialogues are concerned, a dialogue scene can be defined as a set of consecutive shots, which contain conversations of people [3,22].…”

Section: Film Syntax Basics (mentioning)

confidence: 99%

Movie Analysis with Emphasis to Dialogue and Action Scene Detection

Benetos, Siatras, Kotropoulos et al. 2008

Multimodal Processing and Interaction


“…For example, automatically extracted low-level and mid-level visual features are used to detect different types of scenes, focusing on dialogue sequences [4]. Emotional stages as a means for segmenting video are proposed in [6]. The detection of monologues based on audio-visual information is discussed in [7] where a noticeably high average decision performance is reported.…”

Section: Introduction (mentioning)

confidence: 99%

A Framework for Dialogue Detection in Movies

Kotti, Kotropoulos, Ziółko et al. 2006

Multimedia Content Representation, Classification and Security


Abstract. In this paper, we investigate a novel framework for dialogue detection that is based on indicator functions. An indicator function defines that a particular actor is present at each time instant. Two dialogue detection rules are developed and assessed. The first rule relies on the value of the cross-correlation function at zero time lag that is compared to a threshold. The second rule is based on the cross-power in a particular frequency band that is also compared to a threshold. Experiments are carried out in order to validate the feasibility of the aforementioned dialogue detection rules by using ground-truth indicator functions determined by human observers from six different movies. A total of 25 dialogue scenes and another 8 non-dialogue scenes are employed. The probabilities of false alarm and detection are estimated by cross-validation, where 70% of the available scenes are used to learn the thresholds employed in the dialogue detection rules and the remaining 30% of the scenes are used for testing. An almost perfect dialogue detection is reported for every distinct threshold.
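The first rule in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the normalisation by sequence length, the direction of the comparison (at or above the threshold means dialogue), and the sample data are all assumptions made for illustration. The abstract states only that the zero-lag cross-correlation value is compared to a threshold.

```python
from typing import Sequence


def zero_lag_cross_correlation(a: Sequence[int], b: Sequence[int]) -> float:
    """Cross-correlation of two binary indicator functions at zero time lag.

    Each sequence holds 1 where the actor is present at that time instant
    and 0 otherwise. Dividing by the length is an assumed normalisation.
    """
    if len(a) != len(b):
        raise ValueError("indicator functions must have equal length")
    if len(a) == 0:
        raise ValueError("indicator functions must be non-empty")
    return sum(x * y for x, y in zip(a, b)) / len(a)


def is_dialogue(a: Sequence[int], b: Sequence[int], threshold: float) -> bool:
    """Hypothetical decision rule: flag a dialogue when the zero-lag value
    reaches the threshold. The comparison direction is an assumption."""
    return zero_lag_cross_correlation(a, b) >= threshold


# Made-up indicator functions for two actors over eight time instants.
actor_a = [1, 1, 0, 0, 1, 1, 0, 0]
actor_b = [1, 1, 1, 0, 0, 1, 1, 0]

value = zero_lag_cross_correlation(actor_a, actor_b)
decision = is_dialogue(actor_a, actor_b, threshold=0.3)
```

In the abstract the threshold is learned from 70% of the scenes and evaluated on the remaining 30%; here it is simply passed in as a parameter.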
