Image recognition is an important research direction in human-computer interaction with broad development prospects; it aims to enable computers to understand and interpret image content. Early image recognition methods relied mainly on hand-designed feature extraction algorithms for image analysis and classification. However, these methods have significant limitations: they may not give accurate recognition results for complex images and are hard to adapt to different application scenarios. With the advancement of deep learning and increased processing power in recent years, convolutional neural network (CNN)-based image recognition methods have achieved remarkable results in human-computer interaction. Focusing on CNN-based image recognition in human-computer interaction, this paper first studies the CNN model in depth, then describes the application of CNN in human-computer interaction, then enumerates different design methods for comparative analysis, and finally summarizes the advantages and disadvantages of CNN in application and proposes improvements to existing problems. The research shows that CNN-based image recognition outperforms traditional network models and achieves higher accuracy, but it still has some disadvantages. To address these issues and achieve more effective and precise image understanding and interaction, further research on and improvement of the model's structure and algorithms is required. The emergence and development of CNN have greatly promoted the development of image recognition technology, which has been widely used in human-computer interaction and has made great breakthroughs.