A Universal Protocol to Benchmark Camera Calibration for Sports

Magera, Floriane; Hoyoux, Thomas; Barnich, Olivier; Van Droogenbroeck, Marc

doi:10.1109/cvprw63382.2024.00338

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Cited by 3 publications

References 47 publications

X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Models

Held, Itani, Cioppa et al. 2024

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

The rapid advancement of artificial intelligence has led to significant improvements in automated decision-making. However, the increased performance of models often comes at the cost of explainability and transparency of their decision-making processes. In this paper, we investigate the capabilities of large language models to explain decisions, using football refereeing as a testing ground, given its decision complexity and subjectivity. We introduce the EXplainable Video Assistant Referee System, X-VARS, a multi-modal large language model designed for understanding football videos from the point of view of a referee. X-VARS can perform a multitude of tasks, including video description, question answering, action recognition, and conducting meaningful conversations based on video content and in accordance with the Laws of the Game for football referees. We validate X-VARS on our novel dataset, SoccerNet-XFoul, which consists of more than 22k video-question-answer triplets annotated by over 70 experienced football referees. Our experiments and human study illustrate the impressive capabilities of X-VARS in interpreting complex football clips. Furthermore, we highlight the potential of X-VARS to reach human performance and support football referees in the future. We will provide code, model, dataset, and demo upon publication.

SoccerNet-Depth: a Scalable Dataset for Monocular Depth Estimation in Sports Videos

Leduc, Cioppa, Giancola et al. 2024

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Monocular Depth Estimation (MDE) is fundamental in sports video understanding, enhancing augmented graphics, scene understanding, and game state reconstruction. Despite remarkable progress in autonomous driving and indoor scene understanding, there is currently a lack of MDE datasets tailored for sports. Furthermore, most existing datasets only focus on single images, disregarding the temporal aspect. In this work, we introduce the first video dataset for MDE in sports, SoccerNet-Depth, focusing on football and basketball videos. In particular, we leverage the graphic engine from video games to automatically extract video sequences and their associated depth maps, making our dataset easily scalable. Furthermore, we benchmark and fine-tune several state-of-the-art MDE methods on our dataset. Our analysis shows that MDE in sports is far from being solved, making our dataset a perfect playground for future research. Dataset and codes: https://github.com/SoccerNet/sn-depth.

Enhancing Soccer Camera Calibration Through Keypoint Exploitation

Falaleev, Chen 2024

Proceedings of the 7th ACM International Workshop on Multimedia Content Analysis in Sports

Accurate camera calibration is essential for transforming 2D images from camera sensors into 3D world coordinates, enabling precise scene geometry interpretation and supporting sports analytics tasks such as player tracking, offside detection, and performance analysis. However, obtaining a sufficient number of high-quality point pairs remains a significant challenge for both traditional and deep learning-based calibration methods. This paper introduces a multi-stage pipeline that addresses this challenge by leveraging the structural features of the football pitch. Our approach significantly increases the number of usable points for calibration by exploiting line-line and line-conic intersections, points on the conics, and other geometric features. To mitigate the impact of imperfect annotations, we employ data fitting techniques. Our pipeline utilizes deep learning for keypoint and line detection and incorporates geometric constraints based on real-world pitch dimensions. A voter algorithm iteratively selects the most reliable keypoints, further enhancing calibration accuracy. We evaluated our approach on the largest football broadcast camera calibration dataset available, and secured the top position in the SoccerNet Camera Calibration Challenge 2023 [9], which demonstrates the effectiveness of our …
