A Deep Architecture for Multimodal Summarization of Soccer Games

Sanabria, Melissa; Sherly, Sherly; Precioso, Frédéric; Menguy, Thomas

doi:10.1145/3347318.3355524

Cited by 31 publications

(26 citation statements)

References 30 publications

“…We follow the event and bag representation proposed in [48]. A match is a sequence of events {e_1, e_2, ..., e_N} which are all the events occurring on the field (possibly not broadcasted on TV).…”

Section: Proposal Generation (mentioning)

confidence: 99%

“…In order to rely on the importance of each event to select or not an action in the final summary, we need to define a score per event instead of per bag, merging all the possible scores associated to the given event. The score S_{e_n} for the event e_n is then given by the Log-Sum-Exp (LSE) used in [48], the function is defined in Eq. (4).…”

Section: B Proposal Definition (mentioning)

confidence: 99%
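
The statement above names Log-Sum-Exp (LSE) pooling but does not reproduce Eq. (4). As a rough illustration only, the sketch below uses one common LSE pooling form from the multiple-instance-learning literature, (1/r) log((1/M) sum_j exp(r s_j)). The function name `lse_pool`, the sharpness parameter `r`, and the sample scores are assumptions for illustration, not details taken from [48].

```python
import math


def lse_pool(scores, r=1.0):
    """Smooth pooling of scores: (1/r) * log(mean(exp(r * s))).

    Hypothetical helper: `scores` stands for the bag-level scores
    associated with one event; `r` is an assumed sharpness parameter.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if r <= 0:
        raise ValueError("r must be positive")
    m = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    total = sum(math.exp(r * (s - m)) for s in scores)
    return m + math.log(total / len(scores)) / r


# Illustrative scores for the bags that contain one event (made-up values).
bag_scores = [0.2, 0.9]
near_max = lse_pool(bag_scores, r=50.0)   # behaves like a soft maximum
near_mean = lse_pool(bag_scores, r=0.01)  # behaves like an average
```

With a large `r` the pooled value approaches the maximum bag score, and with a small `r` it approaches the mean, which is why this family of functions is often used to merge several scores into one.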

“…Fig. 2c) with two state-of-the-art methods, Sanabria et al [48] and Hori et al [50] (see Fig. 2b), and a baseline (see Fig.…”

Section: Summarization (mentioning)

confidence: 99%

“…We consider that it is also important to show if the use of additional audio features improves the results. Sanabria et al [48] only used the energy of the audio signal; however, as mentioned in section IV-A, there are many other audio features that have helped to improve classifications in other contexts. We created two different models that take as input either the energy features proposed by Sanabria et al or the audio features proposed in this paper (details are in the supplementary material), in order to predict which actions belong to the summary.…”

Section: Summarization (mentioning)

confidence: 99%

Hierarchical Multimodal Attention for Deep Video Summarization

Sanabria

Precioso

Menguy

2021

2020 25th International Conference on Pattern Recognition (ICPR)

Self Cite

The way people consume sports on TV has drastically evolved in recent years, particularly under the combined effects of the legalization of sports betting and the huge increase in sports analytics. Several companies now send observers to stadiums to collect live data on all the events happening on the field during a match. These data provide a very detailed description of all the actions occurring during the match, feeding coaches and staff, fans, viewers, and gamblers. Exploiting these data, sports broadcasters want to generate extra content such as match highlights, match summaries, and player and team analytics to appeal to subscribers. This paper explores the problem of summarizing professional soccer matches as automatically as possible using both the aforementioned event-stream data collected from the field and the content broadcast on TV. We have designed an architecture that introduces (1) a Multiple Instance Learning method that takes into account the sequential dependency among events and (2) a hierarchical multimodal attention layer that grasps the importance of each event in an action. We evaluate our approach on matches from two professional European soccer leagues, showing its capability to identify the best actions for automatic summarization by comparing with real summaries made by human operators.

“…Computer vision methods have been developed to help understand sport broadcasts, carry out analytics within a game [12,20,66], or even assist in broadcast production. Interesting use cases include the automatic summarization of games [21,56,69], the identification of salient game actions [23,45,76] or the reporting of commentaries of live game video streams [78].…”

Section: Related Work (mentioning)

confidence: 99%

Improved Soccer Action Spotting using both Audio and Video Streams

Vanderplaetse

Dupont

2020

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

In this paper, we present a study on multi-modal (audio and video) action spotting and classification in soccer videos. Action spotting and classification are the tasks that consist of finding the temporal anchors of events in a video and determining which events they are. This is an important application of general activity understanding. Here, we propose an experimental study on combining audio and video information at different stages of deep neural network architectures. We used the SoccerNet benchmark dataset, which contains annotated events for 500 soccer game videos from the Big Five European leagues. Through this work, we evaluated several ways to integrate the audio stream into video-only architectures. We observed an average absolute improvement in the mean Average Precision (mAP) metric of 7.43% for the action classification task and of 4.19% for the action spotting task.

Multimodal Learning for Automatic Summarization: A Survey

Zhang

Sun

2023

Lecture Notes in Computer Science
