Automatic modulation classification (AMC) is a key cognitive radio technology for non-cooperative communication. Recently, deep learning has been applied to AMC tasks. In this paper, we propose a deep learning-based AMC scheme that combines random erasing with an attention mechanism to achieve high classification accuracy. First, we propose two data augmentation methods: random erasing at the sample level and random erasing at the amplitude/phase (AP) channel level. The former replaces training samples with noise, while the latter replaces the AP channel information of training samples with noise. Erased data segments are randomly stitched together to expand the training data. Training on data of varying quality gives the deep learning model stronger generalization capability and higher robustness. Second, we propose a single-layer Long Short-Term Memory (LSTM) model based on an attention mechanism. In the first part of this model, we introduce a signal embedding, which enables the input to represent modulation information more comprehensively and accurately. The hidden state output by the LSTM is then fed into the attention module, where it is weighted to help the LSTM model capture the temporal features of modulated signals. Compared with a model without the attention mechanism, this model converges faster and achieves better classification performance. Finally, we propose a random erasing-based test-time augmentation (RE-TTA) method. Test data is randomly erased multiple times and the classification results are evaluated jointly to further improve classification accuracy. Experimental results on the RML2016.10a dataset show that the classification accuracy of the proposed scheme is competitive with state-of-the-art methods.
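
As a rough illustration of the random erasing and RE-TTA ideas summarized above, the following Python sketch may help. It is not the implementation evaluated in this work, and several details are assumptions made only for illustration: each sample is taken to be an array of shape (2, N) holding an amplitude row and a phase row, an erased segment is a single contiguous window filled with zero-mean Gaussian noise, "sample level" is interpreted as erasing both rows over that window, and the test-time results are combined by averaging class probabilities. The names `random_erase`, `re_tta_predict`, `max_frac`, `noise_std`, `n_views`, and `predict_proba` are hypothetical.

```python
import numpy as np


def random_erase(x, rng, level="sample", max_frac=0.3, noise_std=1.0):
    """Return a copy of x with one random contiguous segment replaced by noise.

    x is assumed to have shape (2, N): row 0 amplitude, row 1 phase.
    level="sample"  erases the segment in both rows (assumed interpretation).
    level="channel" erases the segment in one randomly chosen row.
    """
    x = np.array(x, dtype=float, copy=True)  # never modify the caller's data
    n = x.shape[1]
    # Segment length is drawn from 1 .. floor(max_frac * n) (at least 1).
    length = int(rng.integers(1, max(2, int(max_frac * n) + 1)))
    start = int(rng.integers(0, n - length + 1))
    stop = start + length
    if level == "sample":
        x[:, start:stop] = rng.normal(0.0, noise_std, size=(2, length))
    elif level == "channel":
        row = int(rng.integers(0, 2))
        x[row, start:stop] = rng.normal(0.0, noise_std, size=length)
    else:
        raise ValueError("level must be 'sample' or 'channel'")
    return x


def re_tta_predict(predict_proba, x, rng, n_views=8, **erase_kwargs):
    """Average class probabilities over x and n_views randomly erased copies.

    predict_proba is any callable mapping a (2, N) array to a 1-D vector of
    class probabilities. Averaging is an assumed aggregation rule.
    """
    views = [np.asarray(x, dtype=float)]
    views += [random_erase(x, rng, **erase_kwargs) for _ in range(n_views)]
    probs = np.stack([predict_proba(v) for v in views])
    mean_probs = probs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs
```

In this sketch, training-time augmentation would call `random_erase` on each training sample with a generator such as `np.random.default_rng(0)`, while `re_tta_predict` wraps a trained classifier at inference time. Including the unerased input among the averaged views is a design choice of the sketch, not something stated above.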