Image perception of tourism destinations plays an important role in destination marketing and management. Because long-range feature information in travel review text is difficult to capture and locally important information is easily overlooked, we improve on BiLSTM and CNN and propose a travel text classification method based on a hybrid BERT-BiLSTM-CNN-Attention neural network model. Taking Sanya City as the research object, we apply emotion classification and content analysis and construct a framework for analyzing tourism destination image perception based on the “cognitive-emotional” three-dimensional model, providing a research perspective for the sustainable development of tourism in Sanya City. The results show that the proposed model reaches an accuracy of 93.18%, outperforming the other models compared. Tourists’ cognitive perception of the destination image covers four aspects: tourism resources, the tourism environment, tourism infrastructure and supporting services, and tourism activities. Positive emotions dominate the emotional image, while negative emotions focus mainly on tourism infrastructure and supporting services. In terms of overall image perception, tourists rate the tourism image of Sanya City highly. These findings have implications for tourism destinations, including improving management programs, strengthening marketing strategies, and achieving long-term sustainable development.