“…Having acquired this information for each frame, it is relatively easy to translate it to the shot level by computing a single representative face ratio from all the frames of the shot. To keep the shot-level result from being skewed by an unusually large (or unusually small) BB, we first sort the vector containing the BB sizes of all the frames and then select the median value, as in [1], [11]. Having extracted the face-frame ratios, we can use this information to infer the shot type, as in [7].…”
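
The frame-to-shot aggregation described in the excerpt can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `face_frame_ratio` and `shot_face_ratio` are hypothetical, and the per-frame ratio is assumed here to be the BB area divided by the frame area, which the excerpt does not state explicitly.

```python
def face_frame_ratio(bb_width, bb_height, frame_width, frame_height):
    """Per-frame face ratio.

    Assumption: the ratio is BB area over frame area; the excerpt
    only speaks of "BB sizes" and "face-frame ratios".
    """
    return (bb_width * bb_height) / (frame_width * frame_height)


def shot_face_ratio(frame_ratios):
    """Shot-level ratio: sort the per-frame values and take the median.

    Returns None for a shot in which no face was detected
    (an assumed convention, not taken from the excerpt).
    """
    ratios = sorted(frame_ratios)
    n = len(ratios)
    if n == 0:
        return None
    mid = n // 2
    if n % 2 == 1:
        return ratios[mid]
    # Even number of frames: average the two middle values.
    return (ratios[mid - 1] + ratios[mid]) / 2


# One spurious, very large BB (0.95) among otherwise similar frames.
ratios = [0.10, 0.12, 0.11, 0.95, 0.09]
print(shot_face_ratio(ratios))  # 0.11
```

In this example the single outlier would pull the arithmetic mean of the five values up to roughly 0.27, whereas the median stays at 0.11, which is the robustness the sorting step is meant to provide.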