In recent years, convolutional neural networks have become an attractive tool for video processing, largely because the computational power needed to process such data is now available. In several well-defined research areas, such as traffic sign recognition and isolated number recognition, neural networks have proven more reliable than conventional approaches. In this paper, we describe the architecture and implementation of a system for annotating soccer games with data about the players. Convolutional neural networks are used for number recognition. The process runs in real time on streaming video, and the metadata-enriched content is delivered to the user in parallel with the live video. We describe the following modules in some detail: image binarization, shot localization, and the selection and recognition of numbers on players' jerseys.