Lip reading is a method of understanding speech from the movement of the lips. Audio speech is not accessible to all groups in society, especially the hearing impaired and people in noisy environments, and lip reading offers an alternative solution to this problem. Our proposed system takes as input a video of a person speaking digits. In the pre-processing stage, the video is split into sequential frames, and the Viola-Jones algorithm is used to detect the face and then the mouth in each frame. The mouth region of interest (ROI) is extracted and fed into a convolutional neural network (ResNet50), which classifies it. The test frames are then matched against the training frames: if they match, the network is working correctly and the correct digit is recognized; if they do not match, the mismatch counts toward the network's error rate. We used a standard database of spoken digits from 0 to 9 with seven speakers (5 males and 2 females) and obtained an accuracy of 86%.
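
As an illustration of the mouth-ROI step and the accuracy measure described above, the following is a minimal sketch, not the authors' implementation. It assumes a face bounding box has already been produced by a detector such as Viola-Jones, and it estimates the mouth region with a simple geometric heuristic (lower third, central half of the face box), which is an assumption made here for illustration only. The function names `mouth_roi_from_face`, `crop`, and `accuracy` are hypothetical, and the ResNet50 classifier itself is not shown.

```python
from typing import List, Sequence, Tuple

# A bounding box as (x, y, width, height), in pixels.
Box = Tuple[int, int, int, int]


def mouth_roi_from_face(face: Box) -> Box:
    """Estimate a mouth box from a face box.

    Assumption (not from the source): the mouth lies in the lower third
    of the face box and spans its central half horizontally.
    """
    x, y, w, h = face
    top_offset = (2 * h) // 3          # skip the upper two thirds of the face
    return (x + w // 4, y + top_offset, w // 2, h - top_offset)


def crop(frame: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the box out of a grayscale frame stored as rows of pixel values."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of predicted digit labels that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label sequences must be non-empty and equal in length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)
```

In use, each video frame would be passed through the face detector, the resulting box through `mouth_roi_from_face` and `crop`, and the cropped region to the classifier; `accuracy` then compares the predicted digits with the ground-truth labels.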