A network model based on the self-attention mechanism is proposed to address three problems in ghost imaging target recognition: the difficulty of extracting features from targets, low recognition efficiency, and susceptibility to recognition errors. First, a ghost imaging detection system is constructed from a laser, a spatial light modulator, a bucket detector, and supporting components. The object is illuminated with speckle patterns generated by the spatial light modulator, and the detected data are then input into the self-attention network for training. Experimental results show that, for the handwritten digits in the experimental dataset, the self-attention network reaches a highest accuracy of 99.13% and an average accuracy of 96.41%. These results demonstrate the potential of the self-attention network for target recognition in ghost imaging, improving recognition speed and significantly enhancing recognition accuracy.
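The pipeline described above (speckle illumination, bucket detection, self-attention over the detected data) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the network architecture, so the uniform random speckles, the toy 28×28 object, the grouping of 256 bucket values into 16 tokens, the single attention head, and the untrained random projection matrices `w_q`, `w_k`, `w_v` are all assumptions chosen for illustration. The helper names `simulate_bucket_signals` and `self_attention` are hypothetical.

```python
import numpy as np


def simulate_bucket_signals(obj, num_patterns, rng):
    """Simulate bucket-detector values for one object (illustrative model).

    Each speckle pattern illuminates the object, and the bucket detector
    records one number per pattern: the total transmitted intensity.
    """
    h, w = obj.shape
    # Assumption: speckles are uniform random intensities in [0, 1).
    speckles = rng.random((num_patterns, h, w))
    # Sum over both spatial axes -> one scalar per pattern.
    return np.tensordot(speckles, obj, axes=([1, 2], [0, 1]))


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)

# Toy binary object standing in for a handwritten digit (assumption).
obj = np.zeros((28, 28))
obj[6:22, 12:16] = 1.0

# 256 speckle patterns -> 256 bucket values.
signals = simulate_bucket_signals(obj, num_patterns=256, rng=rng)

# Assumption: split the bucket sequence into 16 tokens of 16 values each,
# then standardize before feeding the attention layer.
tokens = signals.reshape(16, 16)
tokens = (tokens - tokens.mean()) / tokens.std()

# Untrained random projections; a real model would learn these.
d_model = 8
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(16, d_model)) for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)
```

In a trained classifier, the attention output `out` would typically be pooled and passed to a classification layer over the ten digit classes; that stage is omitted here because the abstract gives no details about it.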