Undeniably, Deep Learning (DL) has rapidly displaced traditional machine learning in Remote Sensing (RS) and the geosciences, with applications such as scene understanding, material identification, extreme weather detection, and oil spill identification, among many others. Traditional machine learning algorithms receive less and less attention in the era of big data. Recently, a substantial amount of work has aimed at developing image classification approaches that build on the success of DL models in computer vision; the number of relevant articles has nearly doubled every year since 2015. Advances in remote sensing technology, together with the rapidly expanding volume of publicly available satellite imagery worldwide, have opened up possibilities for a wide range of modern applications. However, challenges remain regarding the availability of annotated data, the complex nature of the data, and model parameterization, all of which strongly impact performance. This article presents a comprehensive review of the literature, encompassing a broad spectrum of pioneering work in remote sensing image classification and covering the main network architectures (vintage Convolutional Neural Networks, CNNs; Fully Convolutional Networks, FCNs; encoder-decoder networks; recurrent networks; attention models; and generative adversarial models). The characteristics, capabilities, and limitations of current DL models are examined, and potential research directions are discussed.