Reducing the size of Convolutional Neural Network (CNN) models has recently gained interest for several reasons: lower energy cost, deployment on embedded devices, and multi-core inference. One way to achieve model reduction is to use rotation-invariant CNNs, which avoid the need for data-augmentation techniques. In this work, we present the next step toward a general solution for endowing CNN architectures with the ability to classify rotated objects and predict the rotation angle without data augmentation. The principle is the concatenation of a representation mapping that transforms rotation into translation and a shared-weights predictor. This solution has the advantage of admitting different combinations of basic, existing blocks. We present results obtained with a Gabor-filter bank and a ResNet feature backbone, compared against previous solutions. We also show that one can choose between parallelizing the network across several threads for energy-aware High Performance Computing (HPC) applications and reducing the memory footprint for embedded systems. We obtain a competitive error rate when classifying rotated MNIST and outperform existing state-of-the-art results on CIFAR-10 when training on upright examples and validating on random orientations.
In classification tasks, robustness to various image transformations remains a crucial property of Convolutional Neural Networks (CNNs). It can be acquired through data augmentation, but this comes at the price of a risk of overfitting and a considerable increase in training time. Consequently, other ways to endow CNNs with invariance to various transformations, and mainly to rotations, are an intensive field of study. This paper presents a new, reduced rotation-invariant classification model composed of two parts: a feature-representation mapping and a classifier. We provide insight into the principle and prove that the proposed model is trainable. The model has fewer trainable parameters than similar approaches and offers angular-prediction capabilities. We illustrate the results on the MNIST and CIFAR-10 datasets. On MNIST, we i) achieve state-of-the-art classification on MNIST-rot (when trained on MNIST-rot), and ii) improve the classification results on MNIST-rot (when trained on upright MNIST). When trained on CIFAR-10 with upright samples and tested with rotated samples, we improve the state-of-the-art classification results by 20%. In all cases, we can predict the rotation angle.
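To make the principle concrete, the following is a minimal NumPy sketch of the second stage, the shared-weights predictor. It assumes (as a simplification of the paper's architecture) that the feature-representation mapping has already turned a rotation of the input into a cyclic shift along an orientation axis, e.g. the channel axis of a Gabor-filter-bank response. A single linear classifier `(W, b)` is then applied to every cyclic shift of the response: max-pooling the scores over shifts yields a rotation-invariant class prediction, and the shift achieving the maximum estimates the discrete rotation angle. The function names and parameter shapes here are illustrative, not the paper's actual implementation.

```python
import numpy as np

def shared_weight_predict(resp, W, b):
    """Rotation-invariant classification with a shared-weights predictor.

    resp : (n_orient, d) feature responses; rotating the input image by
           2*pi*k/n_orient cyclically shifts the rows of `resp` by k.
    W, b : a single linear classifier, W of shape (n_orient * d, n_classes),
           shared across all cyclic shifts (weight sharing).

    Returns (predicted class, estimated rotation step).
    """
    n = resp.shape[0]
    # Score every cyclic shift of the response with the SAME classifier.
    scores = np.array(
        [np.roll(resp, -s, axis=0).ravel() @ W + b for s in range(n)]
    )  # shape (n_orient, n_classes)
    cls = int(scores.max(axis=0).argmax())    # max over shifts: invariant to rotation
    shift = int(scores.max(axis=1).argmax())  # best-scoring shift: rotation estimate
    return cls, shift

# Usage: rotating the input (here, cyclically shifting the response rows)
# leaves the class prediction unchanged and moves the angle estimate.
rng = np.random.default_rng(0)
resp = rng.standard_normal((4, 2))              # 4 orientation channels, d = 2
W = rng.standard_normal((8, 3))                 # 3 hypothetical classes
b = rng.standard_normal(3)
c0, s0 = shared_weight_predict(resp, W, b)
c1, s1 = shared_weight_predict(np.roll(resp, 1, axis=0), W, b)
print(c1 == c0, (s1 - s0) % 4)                  # class invariant, shift moves by 1
```

Because the shifted responses are scored with one shared classifier, the score matrix of a rotated input is a row permutation of the original, which is why the max over shifts is exactly invariant; this is also where the parameter reduction comes from, since one classifier serves all orientations.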