Accurate estimation of pupil size and position matters across computer vision, psychology, biometrics, medicine, and robotics, with applications including eye tracking, medical diagnostics, and facial recognition. Traditional pupil estimation techniques are often too slow or too error-prone for real-world use. To address this, our study introduces an approach that improves both the speed and the accuracy of pupil estimation by fine-tuning a pre-trained semantic segmentation model integrated with a shallow convolutional neural network (CNN) backbone. The method has two phases: it starts from a robust pre-trained semantic segmentation model and then fine-tunes it on a diverse collection of eye images. Fine-tuning lets the model learn pupil-specific characteristics, which substantially improves detection precision, while the shallow CNN backbone keeps the model lightweight enough for real-time processing. The novelty of the approach lies in its handling of varying lighting and camera conditions, and our experimental findings indicate that it sets new benchmarks in both speed and accuracy. The result is a practical, efficient pupil estimation method with implications for several key technological domains.

DOI: 10.28991/HIJ-2024-05-02-016
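
The abstract does not say how pupil size and position are read out of the segmentation output, so the following is only an illustrative sketch under stated assumptions: the model is assumed to yield a binary pupil mask, position is taken as the centroid of the pupil pixels, and size as the diameter of a circle with the same area. The names `pupil_from_mask` and `make_disk_mask` are hypothetical and do not come from the paper.

```python
import math
import numpy as np

def pupil_from_mask(mask):
    """Estimate pupil center (x, y) and equivalent-circle diameter in pixels.

    Assumption: `mask` is a 2D array in which nonzero pixels were labeled
    as pupil by a segmentation model. Returns None if no pixel is labeled.
    """
    mask = np.asarray(mask).astype(bool)
    ys, xs = np.nonzero(mask)          # row (y) and column (x) indices of pupil pixels
    area = xs.size                     # pupil area in pixels
    if area == 0:
        return None
    cx = float(xs.mean())              # centroid x
    cy = float(ys.mean())              # centroid y
    # Diameter of the circle whose area equals the pupil pixel count.
    diameter = 2.0 * math.sqrt(area / math.pi)
    return cx, cy, diameter

def make_disk_mask(height, width, cx, cy, radius):
    """Build a synthetic circular mask, used here only to exercise the sketch."""
    yy, xx = np.mgrid[0:height, 0:width]
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2

if __name__ == "__main__":
    synthetic = make_disk_mask(48, 64, cx=32, cy=24, radius=10)
    print(pupil_from_mask(synthetic))
```

The equivalent-circle diameter is a simple choice that tolerates ragged mask edges; an ellipse fit would be a reasonable alternative when the eye is viewed off-axis and the pupil appears elongated.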