The size of one’s pupil can indicate one’s physical condition and mental state. When we search related papers about AI and the pupil, most studies focused on eye-tracking. This paper proposes an algorithm that can calculate pupil size based on a convolution neural network (CNN). Usually, the shape of the pupil is not round, and 50% of pupils can be calculated using ellipses as the best fitting shapes. This paper uses the major and minor axes of an ellipse to represent the size of pupils and uses the two parameters as the output of the network. Regarding the input of the network, the dataset is in video format (continuous frames). Taking each frame from the videos and using these to train the CNN model may cause overfitting since the images are too similar. This study used data augmentation and calculated the structural similarity to ensure that the images had a certain degree of difference to avoid this problem. For optimizing the network structure, this study compared the mean error with changes in the depth of the network and the field of view (FOV) of the convolution filter. The result shows that both deepening the network and widening the FOV of the convolution filter can reduce the mean error. According to the results, the mean error of the pupil length is 5.437% and the pupil area is 10.57%. It can operate in low-cost mobile embedded systems at 35 frames per second, demonstrating that low-cost designs can be used for pupil size prediction.