Computerized lip reading is the science of translating visemes, i.e., lip movements without sound, into written text; video processing is applied to recognize those visemes. Previous research has developed automated lip reading systems as aids for the hearing impaired. Such automation faces many challenges, including insufficient training datasets and speaker dependency. Real-time response is also widely required: lip reading is a form of human-computer interaction, its response time is measured in the number of elapsed video frames, and video processing of lip reading therefore necessitates a real-time implementation. Applications of viseme recognition include aids for deaf people, video games with human-computer interaction, and surveillance systems. In this paper, a real-time viseme recognition system is introduced. To overcome the above pitfalls, we propose a computerized lip reading technique based on feature extraction: block arrangement techniques are used to reach a near-optimal appearance feature extraction scheme, and a deep neural network is utilized to enhance recognition. The benchmark SAVE dataset for Arabic visemes is employed in this research, and high viseme recognition accuracies are achieved. The described computerized lip reading technique is advantageous for the hearing impaired and for speakers in noisy environments.
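As a rough illustration of the pipeline outlined above, the following minimal Python sketch divides a grayscale mouth region of interest into a grid of blocks, takes each block's mean intensity as one appearance feature, and passes the resulting feature vector through a small feed-forward network. The grid size, layer widths, function names, and the assumed 10 viseme classes are illustrative assumptions only, not the paper's exact design.

    # Hypothetical sketch: block-based appearance features from a mouth ROI,
    # classified by a small feed-forward network. Block grid, layer sizes,
    # and class count are illustrative assumptions, not the paper's method.
    import numpy as np

    def block_features(roi, grid=(4, 4)):
        """Split a grayscale mouth ROI into grid blocks and use each
        block's mean intensity as one appearance feature."""
        rows, cols = grid
        h, w = roi.shape
        bh, bw = h // rows, w // cols
        feats = [
            roi[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].mean()
            for r in range(rows) for c in range(cols)
        ]
        return np.asarray(feats, dtype=np.float32)

    rng = np.random.default_rng(0)
    roi = rng.random((32, 48), dtype=np.float32)  # stand-in for a cropped mouth frame
    x = block_features(roi)                       # 4x4 grid -> 16 appearance features

    # Tiny dense-network forward pass (random weights, for illustration only).
    W1, b1 = rng.standard_normal((16, 32)), np.zeros(32)
    W2, b2 = rng.standard_normal((32, 10)), np.zeros(10)  # e.g. 10 viseme classes
    hidden = np.maximum(x @ W1 + b1, 0.0)                 # ReLU hidden layer
    logits = hidden @ W2 + b2
    print("predicted viseme class:", int(np.argmax(logits)))

In a trained system the random weights would be replaced by parameters learned from a labeled viseme dataset; per-frame block features also keep the per-frame cost low enough for the real-time, frame-count-based response requirement described above.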