With the emergence of smartphones, video surveillance cameras, social networks, and multimedia search engines, as well as the growth of the internet and connected objects (the Internet of Things, IoT), the number of available images is increasing rapidly. This makes it necessary to manage a huge amount of data using Big Data technologies. In this context, several sectors, such as security and medicine, need to extract image features (indexes) in order to retrieve these data quickly, efficiently, and with high precision. Two main approaches to this goal exist in the literature. The first uses classical methods based on the extraction of visual features, such as color, texture, and shape, for indexing. The accuracy of these methods was acceptable until the early 2010s. The second approach is based on convolutional neural networks (CNNs), which offer better precision thanks to larger descriptors, but these descriptors increase retrieval time and storage space. To decrease the retrieval time, the size of these descriptor vectors must be reduced using dimensionality reduction methods. In this paper, we propose an approach that addresses the “curse of dimensionality” through an efficient combination of convolutional neural networks and dimensionality reduction methods. Our contribution consists of defining the best combination of the CNN layers with the regional maximum activation of convolutions (RMAC) method and its variants. With our combined approach, we provide reduced descriptors that accelerate retrieval and reduce storage space while maintaining precision. We conclude by proposing the best position for an RMAC layer, with an increase in accuracy ranging from 4.03% to 27.34%, a decrease in retrieval time ranging from 89.66% to 98.14% depending on the CNN architecture, and a 97.96% reduction in the size of the descriptor vector on the GHIM-10K benchmark database.
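As an illustration of the RMAC pooling step mentioned above, the following is a minimal Python/NumPy sketch of the general technique, not the exact layer placement studied in the paper: the feature-map shape, the number of scales, and the uniform-grid region sampling are all simplifying assumptions.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """L2-normalize a 1-D vector."""
    return v / (np.linalg.norm(v) + eps)

def rmac(feature_map, scales=3):
    """Aggregate a (C, H, W) CNN feature map into a single C-dimensional
    RMAC descriptor: max-pool square regions at several scales,
    L2-normalize each regional vector, sum them, and normalize again."""
    C, H, W = feature_map.shape
    descriptor = np.zeros(C, dtype=np.float64)
    for scale in range(1, scales + 1):
        # Square regions whose side shrinks as the scale index grows.
        side = max(1, int(np.ceil(2 * min(H, W) / (scale + 1))))
        # Uniformly spaced top-left corners (simplified sampling grid).
        ys = np.linspace(0, H - side, scale).astype(int)
        xs = np.linspace(0, W - side, scale).astype(int)
        for y in ys:
            for x in xs:
                region = feature_map[:, y:y + side, x:x + side]
                # Maximum activation of convolutions (MAC) for this region.
                region_vec = region.reshape(C, -1).max(axis=1)
                descriptor += l2_normalize(region_vec)
    return l2_normalize(descriptor)

# Example: a 512-channel, 14x14 feature map (e.g. the last convolutional
# layer of a VGG-like network) becomes a single 512-d descriptor.
fmap = np.random.rand(512, 14, 14).astype(np.float32)
print(rmac(fmap).shape)  # (512,)
```

Because each region is max-pooled into one C-dimensional vector and the regional vectors are aggregated, the descriptor size depends only on the number of channels of the feature map, not on its spatial resolution, which is what makes the representation compact.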
Indexing images by content is one of the most widely used computer vision methods, in which various techniques are used to extract visual characteristics from images. The deluge of data surrounding us, due to the heavy use of social media and diverse media acquisition systems, has created a major challenge for classical multimedia processing systems. This problem is referred to as the ‘curse of dimensionality’. In the literature, several methods have been used to reduce the high dimensionality of features, including principal component analysis (PCA) and locality-sensitive hashing (LSH). Other methods, such as the VA-File or binary trees, can be used to accelerate the search phase. In this paper, we propose an efficient approach that exploits three particular methods: PCA and LSH for dimensionality reduction, and the VA-File method to accelerate the search phase. This combined approach is fast and can be applied to high-dimensional features. Our method consists of three phases: (1) indexing the images with the SIFT and SURF algorithms, (2) compressing the data using LSH and PCA, and (3) launching the image retrieval process, which is accelerated by a VA-File approach.
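The three-phase pipeline described in this abstract can be sketched as follows, under simplifying assumptions: each image is represented by a single global vector (random data stands in for aggregated SIFT/SURF descriptors), PCA alone stands in for the compression step (the paper also uses LSH), and the VA-File is reduced to uniform scalar quantization per dimension with a lower-bound filter. Function names such as `build_va_file` and `va_search` are illustrative, not from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

def build_va_file(data, bits=4):
    """Quantize each dimension of `data` into 2**bits cells.
    Returns the per-point cell codes and the per-dimension cell boundaries."""
    n_cells = 2 ** bits
    d = data.shape[1]
    # (d, n_cells + 1) boundaries spanning each dimension's range.
    bounds = np.stack([np.linspace(data[:, j].min(), data[:, j].max(), n_cells + 1)
                       for j in range(d)])
    # (n, d) cell indices, one per point and dimension.
    codes = np.stack([np.clip(np.digitize(data[:, j], bounds[j][1:-1]), 0, n_cells - 1)
                      for j in range(d)], axis=1)
    return codes, bounds

def va_lower_bounds(query, codes, bounds):
    """Lower bound on the Euclidean distance from `query` to every point,
    computed only from the cell codes (the VA-File filtering step)."""
    lo = np.take_along_axis(bounds, codes.T, axis=1).T      # lower cell edges, (n, d)
    hi = np.take_along_axis(bounds, codes.T + 1, axis=1).T  # upper cell edges, (n, d)
    gap = np.maximum(np.maximum(lo - query, query - hi), 0.0)
    return np.sqrt((gap ** 2).sum(axis=1))

def va_search(query, reduced, codes, bounds, k=5, keep=50):
    """Two-phase search: keep the `keep` best candidates by lower bound,
    then rank them by exact distance and return the top `k`."""
    candidates = np.argsort(va_lower_bounds(query, codes, bounds))[:keep]
    exact = np.linalg.norm(reduced[candidates] - query, axis=1)
    return candidates[np.argsort(exact)[:k]]

# Phase 1: one high-dimensional descriptor per image.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 128)).astype(np.float32)

# Phase 2: compress the descriptors (PCA only in this sketch).
reduced = PCA(n_components=32).fit_transform(features)

# Phase 3: VA-File-style filtering, then exact ranking of the survivors.
codes, bounds = build_va_file(reduced)
print(va_search(reduced[0], reduced, codes, bounds))
```

The point of the VA-File step is that the lower bounds are computed from the compact cell codes alone, so the full reduced vectors are read only for the few candidates that survive filtering, which is what accelerates the search phase.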