Detecting distracted driving accurately and quickly with limited resources is an essential yet underexplored problem. Most existing work ignores the resource constraints of real deployments. In this work, we aim to achieve accurate and fast distracted driver detection on embedded devices, where only limited memory and computing resources are available. Specifically, we propose a novel convolutional neural network (CNN) lightweighting method that reduces the number of block layers and shrinks network channels without compromising the model's accuracy. The resulting model is deployed on multiple devices for real-time detection of driving behaviour. Experimental results on the American University in Cairo (AUC) and StateFarm datasets demonstrate the effectiveness of the proposed method. For instance, on the AUC dataset, the proposed MobileNetV2-tiny model achieves 1.63% higher accuracy with only 78% of the parameters of the original MobileNetV2 model. On resource-limited devices, the inference speed of the proposed MobileNetV2-tiny model is on average 1.5 times that of the original MobileNetV2 model, which meets real-time requirements.
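
To make the lightweighting idea concrete, the sketch below shows one way to trim block repeats and shrink channels in MobileNetV2 using PyTorch/torchvision. The specific stage configuration, width multiplier, and class count here are illustrative assumptions for a 10-class driver-behaviour task (as in AUC/StateFarm), not the exact MobileNetV2-tiny configuration reported above.

```python
# A minimal sketch of lightweighting MobileNetV2 by (a) reducing the number
# of inverted-residual block repeats per stage and (b) shrinking channel
# widths. The "tiny" setting below is a hypothetical example, not the
# paper's actual configuration.
import torch
from torchvision.models import MobileNetV2

# Each row is [expansion t, output channels c, repeats n, stride s].
# Relative to the stock MobileNetV2 setting, later stages use fewer
# repeats and narrower channels.
tiny_setting = [
    # t, c,   n, s
    [1, 16,   1, 1],
    [6, 24,   2, 2],
    [6, 32,   2, 2],   # 3 -> 2 repeats
    [6, 64,   3, 2],   # 4 -> 3 repeats
    [6, 96,   2, 1],   # 3 -> 2 repeats
    [6, 128,  2, 2],   # 160 -> 128 channels, 3 -> 2 repeats
    [6, 256,  1, 1],   # 320 -> 256 channels
]

model = MobileNetV2(
    num_classes=10,                 # assumed: 10 driving-behaviour classes
    width_mult=0.75,                # shrink every stage's channel count by 25%
    inverted_residual_setting=tiny_setting,
)

# Sanity check: parameter count and a forward pass at the usual input size.
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params / 1e6:.2f}M")
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 10])
```

In this kind of design, the stage table and width multiplier are the two knobs: trimming repeats (n) removes whole block layers, while a width multiplier below 1.0 uniformly shrinks channels, and the accuracy/parameter trade-off is then validated empirically on the target datasets.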