Determining ovulation time is one of the most important tasks in sow reproduction management. Variation in the vulva temperature of sows can be used as a predictor of ovulation time. However, in existing studies the skin temperatures of sows are extracted manually from infrared thermal images, which is an obstacle to the automatic prediction of ovulation time. In this study, an improved YOLO-V5s detector based on feature fusion and dilated convolution (FD-YOLOV5s) was proposed for the automatic extraction of sow vulva temperature from infrared thermal images. To reduce model complexity, depthwise separable convolution and a modified lightweight ShuffleNet-V2 module were introduced in the backbone. The feature fusion network structure was also simplified for efficiency, and a mixed dilated convolutional module was designed to obtain global features. The experimental results show that FD-YOLOV5s outperformed the nine other methods compared, with a mean average precision (mAP) of 99.1%, an average frame rate of 156.25 fps, and a model size of only 3.86 MB, indicating that the method effectively simplifies the model while maintaining detection accuracy. In a linear regression between manually extracted temperatures and those extracted by this method from randomly selected thermal images, the coefficients of determination for maximum and average vulva temperatures reached 99.5% and 99.3%, respectively. Continuous vulva temperatures of sows were then obtained with the detection algorithm, estrus was detected from the temperature trend, and the results were compared with manually detected estrus. The sensitivity, specificity, and error rate of the estrus detection algorithm were 89.3%, 94.5%, and 5.8%, respectively. 
The method achieves real-time and accurate extraction of sow vulva temperature and can be used for the automatic detection of sow estrus, which could be helpful for the automatic prediction of ovulation time.
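
As an illustration of the temperature extraction and evaluation steps summarized above, the following minimal Python sketch reads the maximum and average temperature inside a detected bounding box and computes sensitivity and specificity from confusion counts. It is not the FD-YOLOV5s implementation. The function names, the per-pixel temperature grid, and the box format `(x_min, y_min, x_max, y_max)` are assumptions made for this example, and the error rate is left out because this section does not define it.

```python
from statistics import fmean


def box_temperature(temps, box):
    """Return (max, mean) temperature inside a bounding box.

    temps: list of rows of per-pixel temperatures in deg C (assumed input;
           a real pipeline would first convert the thermal image to this grid).
    box:   (x_min, y_min, x_max, y_max) in pixels, end-exclusive
           (assumed format for a detector's output).
    """
    x0, y0, x1, y1 = box
    # Clip the start to the image; slicing already tolerates an oversized end.
    x0, y0 = max(0, x0), max(0, y0)
    values = [t for row in temps[y0:y1] for t in row[x0:x1]]
    if not values:
        raise ValueError("bounding box does not overlap the image")
    return max(values), fmean(values)


def sensitivity_specificity(tp, fn, tn, fp):
    """Standard definitions: TP / (TP + FN) and TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Toy 3x3 temperature grid (made-up values, for illustration only).
grid = [
    [30.0, 31.0, 32.0],
    [33.0, 38.5, 37.5],
    [34.0, 36.0, 35.0],
]
t_max, t_mean = box_temperature(grid, (1, 1, 3, 3))
print(t_max, t_mean)
```

In practice the box would come from the detector for each frame, and the resulting per-frame maximum and average values would form the continuous temperature series from which the estrus trend is derived.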