Automatic modulation classification (AMC) has become a key component of next-generation drone communication systems, where it is crucial for improving communication efficiency in non-cooperative environments. However, the trade-off between accuracy and efficiency in current methods hinders the practical application of AMC in these systems. In this paper, we propose a real-time AMC method based on a lightweight mobile radio transformer (MobileRaT). The radio transformer is trained iteratively while redundant weights are pruned based on information entropy, so that it learns robust modulation knowledge from multimodal signal representations for the AMC task. To the best of our knowledge, this is the first attempt to integrate a pruning technique with a lightweight transformer model for processing temporal signals, preserving AMC accuracy while improving inference efficiency. Experimental comparisons with a series of state-of-the-art methods on two public datasets verify the superiority of MobileRaT. Two models, MobileRaT-A and MobileRaT-B, were evaluated on RadioML 2018.01A and RadioML 2016.10A, respectively, achieving average AMC accuracies of 65.9% and 62.3% and peak AMC accuracies of 98.4% at +18 dB and 99.2% at +14 dB. Ablation studies demonstrate the robustness of MobileRaT to hyper-parameters and signal representations. Together, the experimental results indicate that MobileRaT adapts to varying communication conditions and can be deployed on drone receivers to achieve air-to-air and air-to-ground cognitive communication in less demanding communication scenarios.
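To make the idea of entropy-based weight pruning concrete, the following minimal sketch scores each row of a weight matrix by the Shannon entropy of a histogram of its values and zeroes out the lowest-scoring rows. This is an illustration under stated assumptions, not the MobileRaT procedure: the abstract does not specify how entropy is computed or at what granularity weights are pruned, so the row-wise histogram criterion, the function names `histogram_entropy` and `prune_low_entropy_rows`, and the parameters `bins` and `prune_ratio` are all hypothetical.

```python
import math


def histogram_entropy(values, bins=8):
    """Shannon entropy (in bits) of an equal-width histogram over `values`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # all values identical, so the histogram has one bin
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def prune_low_entropy_rows(weights, prune_ratio=0.25, bins=8):
    """Zero out the rows with the lowest histogram entropy.

    Assumption for illustration only: a low-entropy row is treated as
    redundant. Returns the pruned copy of `weights` and a 0/1 keep-mask.
    """
    scores = [histogram_entropy(row, bins) for row in weights]
    n_prune = int(len(weights) * prune_ratio)
    order = sorted(range(len(weights)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    new_weights = [
        [0.0] * len(row) if i in pruned else list(row)
        for i, row in enumerate(weights)
    ]
    mask = [0 if i in pruned else 1 for i in range(len(weights))]
    return new_weights, mask


# Toy 4x4 weight matrix; row 0 is constant and therefore has zero entropy.
toy_weights = [
    [0.5, 0.5, 0.5, 0.5],
    [0.1, -0.3, 0.7, 0.2],
    [0.0, 0.0, 0.0, 0.1],
    [-0.9, 0.4, 0.05, 0.6],
]
pruned_weights, keep_mask = prune_low_entropy_rows(toy_weights, prune_ratio=0.25)
```

In an iterative scheme of the kind the abstract describes, a pruning step like this would presumably alternate with further training so that the remaining weights can compensate for the removed ones; the schedule and pruning ratio used by the authors are not given here.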