Digitizing audiovisual archives has become a complex field that requires substantial human and material resources. Its automation and optimization have attracted sustained interest from researchers and media manufacturers, particularly those working to integrate artificial intelligence tools into the industry. As part of an audiovisual archive digitization project, this work develops an optical character recognition (OCR) and face recognition model that automates the manual tasks of the audiovisual archivist, taking TV news video as input. This article presents an approach for recognizing Arabic-language lower thirds and for detecting and recognizing the faces of news presenters, with the aim of providing accurate classification results, and it reviews different methods and algorithms for Arabic character recognition. Many studies have been presented in this area; however, satisfactory classification accuracy has yet to be achieved. The state-of-the-art approaches used for comparison address either face recognition or OCR, whereas this model supports both at the same time. The article describes the context of the work, the proposed machine learning method for extracting text from video, the specific characteristics of the Arabic language, and the reasons behind the decisions taken at each step. In a real project at the media station, the best result obtained with this approach was 90.60%. The face dataset was collected from images of presenters, and the character dataset was obtained via the Pytesseract library.