Water turbidity is a key indicator of water quality, but conventional contact-based monitoring equipment is expensive and difficult to maintain, so developing a cost-effective, noncontact turbidity measurement method remains essential for scientific water quality monitoring. In this study, we propose a video image-based model for water turbidity recognition built on image regression. Residual neural networks (ResNet) and transfer learning are employed to improve model robustness, a balanced mean squared error (BMSE) loss function is used to counteract the impact of imbalanced training data, and an image preclassification module mitigates the influence of changing lighting conditions on turbidity recognition. The effectiveness of these methods is validated on data from the United States Geological Survey (USGS). Our findings show that the BMSE loss function outperforms the standard mean squared error (MSE) loss in turbidity recognition, and that recognition on global images consistently surpasses recognition on region of interest (ROI) images, yielding more accurate and reliable results under various lighting scenarios. In addition, the influence of neural network depth on recognition accuracy varies with lighting conditions, underscoring the need for careful selection of the network architecture. Despite these promising results, the study identifies challenges for continuous noncontact monitoring, especially at sunrise and sunset, when rapidly changing lighting conditions can cause recognition errors. The video image-based water turbidity identification model established in this study provides a new, low-cost, noncontact approach to monitoring water quality in rivers.
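
The paper does not include an implementation in this section, but the core ingredients it names, a pretrained ResNet fine-tuned as an image regressor and a balanced MSE loss for imbalanced targets, can be sketched as below. The ResNet-18 backbone, the class name `BalancedMSELoss`, the learnable noise scale `sigma`, and the batch-based form of the balanced loss are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (assumptions: PyTorch/torchvision, ResNet-18 backbone,
# single turbidity output; the balanced MSE follows a batch-based
# formulation and may differ from the authors' exact loss).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


def build_turbidity_regressor() -> nn.Module:
    """Pretrained ResNet with its classifier replaced by a 1-unit regression head."""
    backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # transfer learning: fine-tune on turbidity labels
    return backbone


class BalancedMSELoss(nn.Module):
    """Batch-based balanced MSE: cross-entropy over pairwise squared errors,
    which down-weights over-represented turbidity ranges in an imbalanced batch."""

    def __init__(self, init_sigma: float = 1.0):
        super().__init__()
        # Learnable noise scale, trained jointly with the network.
        self.log_sigma2 = nn.Parameter(torch.tensor(float(init_sigma)).log() * 2)

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        pred = pred.view(-1, 1)       # (B, 1) predicted turbidity
        target = target.view(-1, 1)   # (B, 1) measured turbidity
        # Logits: negative squared distance between each prediction and every target in the batch.
        logits = -(pred - target.T) ** 2 / (2 * self.log_sigma2.exp())
        # The "correct class" for sample i is its own target (the diagonal).
        labels = torch.arange(pred.size(0), device=pred.device)
        return F.cross_entropy(logits, labels)


# Usage sketch: video frames -> predicted turbidity, optimized with the balanced loss.
model = build_turbidity_regressor()
criterion = BalancedMSELoss()
optimizer = torch.optim.Adam(
    list(model.parameters()) + list(criterion.parameters()), lr=1e-4
)

images = torch.randn(8, 3, 224, 224)   # placeholder batch of frames
turbidity = torch.rand(8) * 100        # placeholder turbidity labels
loss = criterion(model(images), turbidity)
loss.backward()
optimizer.step()
```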