This paper presents an efficient model for analysing contextual geographic entities in satellite data at the component level. The model combines several feature extraction techniques (Fourier, entropy, wavelet, and Gabor features) with the vegetation indices NDVI, SAVI, and EVI to capture the multimodal characteristics of satellite data. Classification is performed by a binary cascaded convolutional neural network (BCCNN), and incremental learning is handled through Q-learning. The novelty of the approach lies in fusing these complementary feature sets into a single, comprehensive representation of the satellite data. The BCCNN identifies specific geographical features such as land, forests, structures, and rivers with high accuracy, and the Q-learning component allows that accuracy to improve over time as new data becomes available. Evaluated on an augmented collection of datasets and samples, the model identified contextual geographic entities with 99.5% accuracy, 98.5% precision, and 98.3% recall. The model holds promise for applications including environmental monitoring, disaster management, and urban planning.
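To make the vegetation-index features concrete, the following minimal sketch computes NDVI, SAVI, and EVI per pixel from surface reflectance values using their standard published definitions. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names, the soil adjustment factor L = 0.5 for SAVI, the usual EVI coefficients (2.5, 6, 7.5, 1), and the zero-denominator fallback are all choices made here for illustration.

```python
from typing import List, Sequence, Tuple


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    # Assumption: return 0.0 when the denominator vanishes (e.g. no-data pixels).
    return (nir - red) / denom if denom != 0 else 0.0


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil Adjusted Vegetation Index with soil adjustment factor L.

    SAVI = (1 + L) * (NIR - Red) / (NIR + Red + L); L = 0.5 is a common default.
    """
    denom = nir + red + soil_factor
    return (1.0 + soil_factor) * (nir - red) / denom if denom != 0 else 0.0


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced Vegetation Index with the commonly used coefficients.

    EVI = 2.5 * (NIR - Red) / (NIR + 6 * Red - 7.5 * Blue + 1)
    """
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    return 2.5 * (nir - red) / denom if denom != 0 else 0.0


def index_features(
    nir: Sequence[float], red: Sequence[float], blue: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return one (NDVI, SAVI, EVI) tuple per pixel (hypothetical helper)."""
    if not (len(nir) == len(red) == len(blue)):
        raise ValueError("band sequences must have the same length")
    return [(ndvi(n, r), savi(n, r), evi(n, r, b)) for n, r, b in zip(nir, red, blue)]


if __name__ == "__main__":
    # Two illustrative pixels: vegetated (high NIR) and bare or built-up.
    features = index_features(nir=[0.5, 0.2], red=[0.1, 0.2], blue=[0.05, 0.1])
    for pixel in features:
        print(pixel)
```

In a pipeline like the one described, such per-pixel index values would presumably be stacked alongside the Fourier, entropy, wavelet, and Gabor features before classification; how the features are actually fused is not specified in this abstract.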