As a safety-related application, visual systems based on deep neural networks (DNNs) in modern unmanned aerial vehicles (UAVs) show adversarial vulnerability when performing real-time inference. Recently, deep ensembles with various defensive strategies against adversarial samples have drawn much attention due to the increased diversity and reduced variance for their members. Aimed at the recognition task of remote sensing images (RSIs), this paper proposes to use a reactive-proactive ensemble defense framework to solve the security problem. In reactive defense, we fuse scoring functions of several classical detection algorithms with the hidden features and average output confidences from sub-models as a second fusion. In terms of proactive defense, we attempt two strategies, including enhancing the robustness of each sub-model and limiting the transferability among sub-models. In practical applications, the real-time RSIs are first input to the reactive defense part, which can detect and reject the adversarial RSIs. The accepted ones are then passed to robust recognition with a proactive defense. We conduct extensive experiments on three benchmark RSI datasets (i.e., UCM, AID, and FGSC-23). The experimental results show that the deep ensemble method of reactive and proactive defense performs very well in gradient-based attacks. The analysis of the applicable attack scenarios for each proactive ensemble defense is also helpful for this field. We also perform a case study with the whole framework in the black-box scenario, and the highest detection rate reaches 93.25%. Most of the adversarial RSIs can be rejected in advance or correctly recognized by the enhanced deep ensemble. This article is the first one to combine reactive and proactive defenses with a deep ensemble against adversarial attacks in the context of RSI recognition for DNN-based UAVs.