Interactive image segmentation incorporates user guidance and can be applied in practical production and daily-life settings. Nonetheless, the technology's intrinsic limitations, including complicated interaction schemes and high error rates, have impeded its further advancement alongside the development of computer vision. To address this issue, this study introduces an extreme point determination method based on center point prediction. The predicted target center serves as the "symmetric center" of the extreme points, which facilitates the search for the remaining extreme points. Furthermore, the Canny algorithm is incorporated to detect edges in the image. Moreover, the residual network is enhanced by embedding a pre-activation step, introducing a BatchNorm layer, and adding a pyramid scene parsing network. Finally, the performance of this method is verified by analyzing its Intersection over Union, segmentation accuracy, efficiency, F1 score, and other indicators. The results show that on the PASCAL VOC 2012 dataset, the segmentation accuracy obtained through the extreme point method exceeds 90%. The addition of the pyramid scene parsing network stabilizes its accuracy on urban landscape datasets between 92% and 96%. When the proposed image segmentation method is applied to the GrabCut dataset, its Intersection over Union reaches 88.7%. On a self-constructed dataset of complex daily scenes, this method achieves a segmentation accuracy of 95% with superior stability and precision. This work provides a new methodological reference for further optimizing interactive image segmentation technology.
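
The abstract describes using the predicted target center as a point of symmetry for locating extreme points, refined with Canny edges. The following is a minimal sketch of that idea, assuming OpenCV and NumPy; the function name `find_opposite_extreme`, the thresholds, and the snapping-to-nearest-edge step are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: reflect a known extreme point about the predicted
# target center (the "symmetric center") to hypothesize the opposite
# extreme point, then snap the guess to the nearest Canny edge pixel.
# Names and thresholds are assumptions for illustration only.
import cv2
import numpy as np

def find_opposite_extreme(image, center, known_extreme, canny_lo=100, canny_hi=200):
    """Reflect a known extreme point about the center and snap it to an edge."""
    edges = cv2.Canny(image, canny_lo, canny_hi)                 # binary edge map
    cx, cy = center
    kx, ky = known_extreme
    guess = np.array([2 * cx - kx, 2 * cy - ky], dtype=float)    # point reflection

    ys, xs = np.nonzero(edges)                                   # edge pixel coordinates
    if xs.size == 0:
        return int(guess[0]), int(guess[1])                      # no edges: keep the guess
    d2 = (xs - guess[0]) ** 2 + (ys - guess[1]) ** 2
    i = int(np.argmin(d2))                                       # nearest edge pixel
    return int(xs[i]), int(ys[i])

if __name__ == "__main__":
    # "example.jpg", the center prediction, and the annotated extreme
    # point are hypothetical inputs used only to exercise the sketch.
    img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)
    if img is not None:
        print(find_opposite_extreme(img, center=(160, 120), known_extreme=(100, 120)))
```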
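
The abstract also names two architectural ingredients: a pre-activation step with BatchNorm inside the residual network, and a pyramid scene parsing network. The PyTorch sketch below illustrates what these components typically look like; the channel counts, pyramid bin sizes, and fusion layer are assumptions chosen for brevity rather than the paper's exact configuration.

```python
# Illustrative PyTorch sketch of a pre-activation residual block
# (BatchNorm -> ReLU -> conv) and a PSPNet-style pyramid pooling module.
# Layer sizes are assumptions, not the paper's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreActResidualBlock(nn.Module):
    """Residual block in which BatchNorm and ReLU precede each convolution."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        return x + out                              # identity shortcut

class PyramidPooling(nn.Module):
    """Pool the feature map at several scales, then fuse with the input."""
    def __init__(self, channels, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),
                nn.Conv2d(channels, channels // len(bins), 1, bias=False),
            )
            for b in bins
        )
        self.fuse = nn.Conv2d(channels * 2, channels, 3, padding=1, bias=False)

    def forward(self, x):
        h, w = x.shape[-2:]
        pooled = [F.interpolate(s(x), size=(h, w), mode="bilinear",
                                align_corners=False) for s in self.stages]
        return self.fuse(torch.cat([x] + pooled, dim=1))

if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)               # dummy backbone feature map
    feat = PreActResidualBlock(64)(feat)
    print(PyramidPooling(64)(feat).shape)           # torch.Size([1, 64, 32, 32])
```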