Joint vehicle localization and categorization in high-resolution aerial images can provide useful information for applications such as traffic flow structure analysis. To retain sufficient features for recognizing small-scale vehicles, a detection structure resembling regions with convolutional neural network features (R-CNN) is employed. In this setting, cascaded localization error can be avoided by treating the negatives and the differently typed positives equally in a single multi-class classification task, but the problem of class imbalance remains. To address this issue, a cost-effective network extension scheme is proposed. In it, the correlated convolution and connection costs incurred during extension are reduced by feature map selection and bipartite main-side network construction, which are realized with the help of a novel feature map class-importance measurement and a new class-imbalance-sensitive main-side loss function. The effectiveness of the proposed network extension is verified against similarly shaped strong counterparts on an image classification dataset built from a set of traditional true-color aerial images with 0.13 m ground sampling distance, taken from a height of 1000 m by an imaging system composed of non-metric cameras. Experiments show equivalent or better performance while requiring the least parameter and memory overhead.
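To make the class-imbalance problem concrete, the following is a minimal sketch of a generic inverse-frequency weighted cross-entropy. It is not the main-side loss function proposed in the paper, whose form the abstract does not give; the function names, the class layout (background, car, truck), and the toy labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * count): rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {c: total / (num_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the sum of the weights."""
    numerator = 0.0
    denominator = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        numerator += -w * math.log(p[y])
        denominator += w
    return numerator / denominator


# Hypothetical toy labels: 0 = background (negative), 1 = car, 2 = truck (rare).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
weights = inverse_frequency_weights(labels)

# Hypothetical predictions: a uniform guess over the three classes for every sample.
uniform_probs = [[1 / 3, 1 / 3, 1 / 3] for _ in labels]
loss = weighted_cross_entropy(uniform_probs, labels, weights)
```

With such weights, a misclassified sample from the rare class contributes more to the loss than one from the dominant negative class, which is the general effect a class-imbalance-sensitive loss aims for.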
Scene classification is one of the fundamental techniques shared by many basic remote sensing tasks with a wide range of applications. As demand grows for handling situations with high variance under data-scarce conditions, few-shot scene classification, which focuses on building classification models from few training samples, is receiving increasing interest. Currently, methods using the meta-learning principle or graphical models achieve state-of-the-art performance. However, significant gaps remain between few-shot methods and conventionally trained ones, because implicit data isolation in the standard meta-learning procedure and the limited flexibility of static graph neural network modeling largely restrict the efficiency of the data-to-knowledge transition. To address these issues, this paper proposes a novel few-shot scene classification algorithm based on a different meta-learning principle called continual meta-learning, which enhances inter-task correlation by fusing more historical prior knowledge from a sequence of tasks within sections of the meta-training or meta-testing periods. Moreover, to increase the discriminative power between classes, a graph transformer is introduced to produce structural attention, which optimizes the distribution of sample features in the embedding space and improves the overall classification capability of the model. The advantages of the proposed algorithm are verified by comparison with nine state-of-the-art meta-learning-based few-shot scene classification methods on three popular datasets, where an accuracy increase of at least 9% is observed. Furthermore, the efficiency of the newly added modular modifications has also been verified by comparison with the continual meta-learning baseline.
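For readers unfamiliar with the few-shot setting, the sketch below shows how a single N-way K-shot task (episode) is commonly sampled in standard meta-learning. It illustrates only the conventional episode construction, not the continual meta-learning procedure or the graph transformer described above; the function name `sample_episode`, the scene class names, and the file-like identifiers are hypothetical.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode as (support, query) lists of (item, label)."""
    # Pick n_way classes; sorting first keeps the draw reproducible for a given seed.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw disjoint support and query items from the same class.
        items = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in items[:k_shot]]
        query += [(x, episode_label) for x in items[k_shot:]]
    return support, query


# Hypothetical toy dataset: six scene classes with 20 image identifiers each.
class_names = ["airport", "beach", "forest", "harbor", "farmland", "river"]
data = {name: [f"{name}_{i}" for i in range(20)] for name in class_names}

rng = random.Random(0)
support, query = sample_episode(data, n_way=5, k_shot=1, n_query=15, rng=rng)
```

In the standard procedure each episode is sampled and processed independently of the others, which is one way to read the "implicit data isolation" that the abstract sets out to address.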