Detecting obstacles at rail level crossings (RLCs) is an important task for ensuring the safety of train traffic. Traffic control systems require reliable sensors to determine the state of an RLC. Fusing information from several sensors located at the site increases the capability to react to dangerous situations. One such source is video from monitoring cameras. This paper presents a deep learning method for processing video data to determine the state of the area (the region of interest, ROI) that is vital for the safe passage of a train. The proposed approach is validated using video surveillance material from a number of RLC sites in Poland. The recordings cover 24/7 observation in all weather conditions and all seasons of the year. Results show that recall values reach 0.98 while using significantly reduced processing resources. The solution can be used as an auxiliary source of signals for train control systems, together with other sensor data, and the fused dataset can meet railway safety standards.