“…These tags help maintain coherence between the intended narrative action and the content actually produced; the mood of the characters is also important for conveying the intended narrative action. Both tasks can be performed with satisfactory precision by current video processing technologies, namely character recognition (for example, based on face recognition, as we explored in [27]) and facial expression analysis to extract the mood of the characters [34]. However, since the shot under consideration is being repurposed to represent a part of the narrative action that generally differs from the one represented in the original scene from which it is taken, it is not necessary to describe the characters' emotions in the shot in depth.…”