Nowadays, driving accidents are considered one of the most crucial challenges for governments and communities, affecting transportation systems and people's lives. There are many causes behind these accidents; however, drowsiness is one of the main factors leading to a significant number of injuries and deaths. To reduce its effect, researchers and communities have proposed many techniques for detecting drowsiness and alerting the driver before an accident occurs. Most of the proposed solutions are vision-based, where a camera is positioned in front of the driver to monitor their facial behavior and determine their state, e.g., drowsy or awake. However, most of these solutions sacrifice either detection accuracy or speed. In this paper, we propose a novel Visual-based Alerting System for Detecting Drowsy Drivers (VAS-3D) that ensures an optimal trade-off between accuracy and speed. VAS-3D consists of two stages: detection and classification. In the detection stage, we use pre-trained Haar cascade models to detect the driver's face and eyes. Once the driver's eyes are detected, the classification stage uses a pre-trained Convolutional Neural Network (CNN) model to classify the driver's eyes as either open or closed, and consequently the driver's state as either awake or drowsy. We tested and compared the performance of several CNN models, namely InceptionV3, MobileNetV2, NASNetMobile, and ResNet50V2. We demonstrated the performance of VAS-3D through simulations on real drowsiness datasets and through experiments on real-world scenarios based on live video streaming. The obtained results show that VAS-3D can improve drowsy-driver detection accuracy by at least 7.5% (with a best accuracy of 95.5%) and detection speed by up to 57% (an average of 0.25 ms per frame) compared to other existing models.
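The two-stage pipeline described above can be summarized in a minimal sketch, assuming an OpenCV Haar-cascade front end and a Keras CNN eye-state classifier (e.g., a fine-tuned MobileNetV2). The model file name, the 224x224 input size, and the 0.5 decision threshold below are illustrative assumptions, not artifacts released with VAS-3D.

```python
# Minimal sketch of the two-stage pipeline: Haar-cascade detection of the
# face and eyes, followed by CNN classification of the eye state.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Pre-trained Haar cascades shipped with OpenCV (detection stage).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

# Hypothetical eye-state classifier, e.g., a fine-tuned MobileNetV2
# (classification stage); "eye_state_cnn.h5" is an assumed file name.
eye_classifier = load_model("eye_state_cnn.h5")

def classify_frame(frame):
    """Return 'drowsy' if a detected eye is classified as closed, else 'awake'."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (fx, fy, fw, fh) in face_cascade.detectMultiScale(gray, 1.3, 5):
        face_roi = gray[fy:fy + fh, fx:fx + fw]
        for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(face_roi):
            # Crop the eye region from the color frame and normalize it.
            eye = frame[fy + ey:fy + ey + eh, fx + ex:fx + ex + ew]
            eye = cv2.resize(eye, (224, 224)).astype(np.float32) / 255.0
            p_open = eye_classifier.predict(eye[np.newaxis], verbose=0)[0][0]
            return "awake" if p_open > 0.5 else "drowsy"  # assumed threshold
    return "unknown"  # no face or eyes detected in this frame

# Usage on a live video stream from a driver-facing camera.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print(classify_frame(frame))
cap.release()
```

In a real deployment, the per-frame decision would typically be smoothed over several consecutive frames before triggering an alert, so that a single blink is not mistaken for drowsiness.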