In recent years, cinematography and other digital content creators have been eagerly turning to Three-Dimensional (3D) imaging technology. The creators of movies, games, and augmented reality applications are aware of this technology’s advantages, possibilities, and new means of expression. The development of electronic and IT technologies enables the achievement of a better and better quality of the recorded 3D image and many possibilities for its correction and modification in post-production. However, preparing a correct 3D image that does not cause perception problems for the viewer is still a complex and demanding task. Therefore, planning and then ensuring the correct parameters and quality of the recorded 3D video is essential. Despite better post-production techniques, fixing errors in a captured image can be difficult, time consuming, and sometimes impossible. The detection of errors typical for stereo vision related to the depth of the image (e.g., depth budget violation, stereoscopic window violation) during the recording allows for their correction already on the film set, e.g., by different scene layouts and/or different camera configurations. The paper presents a prototype of an independent, non-invasive diagnostic system that supports the film crew in the process of calibrating stereoscopic cameras, as well as analysing the 3D depth while working on a film set. The system acquires full HD video streams from professional cameras using Serial Digital Interface (SDI), synchronises them, and estimates and analyses the disparity map. Objective depth analysis using computer tools while recording scenes allows stereographers to immediately spot errors in the 3D image, primarily related to the violation of the viewing comfort zone. The paper also describes an efficient method of analysing a 3D video using Graphics Processing Unit (GPU). The main steps of the proposed solution are uncalibrated rectification and disparity map estimation. The algorithms selected and implemented for the needs of this system do not require knowledge of intrinsic and extrinsic camera parameters. Thus, they can be used in non-cooperative environments, such as a film set, where the camera configuration often changes. Both of them are implemented with the use of a GPU to improve the data processing efficiency. The paper presents the evaluation results of the algorithms’ accuracy, as well as the comparison of the performance of two implementations—with and without the GPU acceleration. The application of the described GPU-based method makes the system efficient and easy to use. The system can process a video stream with full HD resolution at a speed of several frames per second.