Rapid growth in the transfer of multimedia information over the Internet calls for algorithms that can retrieve a queried image from large image database repositories. The proposed content-based image retrieval (CBIR) method uses Gaussian-Hermite moments as low-level features, which are then compressed with principal component analysis. The compressed feature set is multiplied by a weight array of the same size as the feature vector. Hybrid firefly and grey wolf optimization (FAGWO) is used to prevent the premature convergence of the firefly algorithm. Image retrieval is carried out in an OpenCV Python environment with K-nearest neighbours and random forest classifiers. The fitness function for FAGWO is the accuracy of the classifier, and the FAGWO algorithm derives the optimum weights from a randomly generated initial population. When these optimized weights are applied, the proposed algorithm shows better precision/recall and efficiency than other techniques such as exact Legendre moments, region-based image retrieval, K-means clustering and the Color descriptor wavelet-based texture descriptor retrieval technique. In terms of optimization, hybrid FAGWO outperformed Particle Swarm Optimization, Genetic Algorithm, Grey Wolf Optimization and the Firefly algorithm when each was used alone.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
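The weighting and fitness steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for Gaussian-Hermite moment features, the helper names (`pca_compress`, `knn_predict`, `fitness`) are hypothetical, and a plain NumPy nearest-neighbour vote replaces the classifiers used in the paper. The FAGWO update rules are not given in the abstract, so only the evaluation of a randomly generated initial population is shown.

```python
import numpy as np

def pca_compress(features, n_components):
    """Project features onto their top principal components (via SVD)."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def knn_predict(train_x, train_y, query_x, k=3):
    """Plain k-nearest-neighbour majority vote with Euclidean distance."""
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def fitness(weights, train_x, train_y, val_x, val_y, k=3):
    """Classifier accuracy after element-wise weighting of each feature."""
    preds = knn_predict(train_x * weights, train_y, val_x * weights, k)
    return float(np.mean(preds == val_y))

rng = np.random.default_rng(0)
# Synthetic stand-in for moment features: two classes, 20 raw dimensions.
labels = np.repeat([0, 1], 40)
raw = rng.normal(size=(80, 20)) + labels[:, None] * 2.0
# For brevity PCA is fitted on all samples; a real pipeline would fit it
# on the training split only.
compressed = pca_compress(raw, n_components=5)

idx = rng.permutation(80)
train, val = idx[:60], idx[60:]
# Randomly generated initial population of candidate weight vectors,
# each the same size as the compressed feature vector.
population = rng.uniform(0.0, 1.0, size=(10, 5))
scores = [fitness(w, compressed[train], labels[train],
                  compressed[val], labels[val]) for w in population]
best = population[int(np.argmax(scores))]
```

An optimizer such as FAGWO would repeatedly update `population` and re-evaluate `fitness`, keeping the weight vector with the highest accuracy.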
Monocular depth estimation is an active research topic in autonomous driving. The proposed work uses deep convolutional neural networks (DCNN) comprising an encoder and a decoder, with transfer learning, to estimate depth maps from two-dimensional images. CNN features extracted in the initial stages are later upsampled using a sequence of bilinear upsampling and convolution layers to reconstruct the depth map. The encoder forms the feature extraction part, and the decoder forms the image reconstruction part. EfficientNet-B0, a recent architecture, is used with pretrained weights as the encoder. It has fewer model parameters yet achieves higher efficiency than state-of-the-art pretrained networks. EfficientNet-B0 is compared with two other pretrained networks, DenseNet-121 and ResNet-50. Each of these three models is used in the encoding stage for feature extraction, followed by bilinear upsampling in the decoder. Monocular depth estimation is an ill-posed problem and is treated here as a regression problem. The metrics used in the proposed work include F1-score, Jaccard score and mean absolute error (MAE) between the original and the reconstructed image. The results show that EfficientNet-B0 outperforms the DenseNet-121 and ResNet-50 models in validation loss, F1-score and Jaccard score.
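As a rough illustration of the decoder's bilinear upsampling step and of the MAE metric, the sketch below upsamples a small 2-D map by a factor of two and compares it with a reference. It assumes a corner-aligned sampling convention and uses hypothetical helper names (`bilinear_upsample_2x`, `mean_absolute_error`); deep learning frameworks may sample at different pixel positions, and the paper's decoder also interleaves convolution layers that are omitted here.

```python
import numpy as np

def bilinear_upsample_2x(feature_map):
    """Upsample a 2-D array by a factor of 2 with bilinear interpolation."""
    h, w = feature_map.shape
    # Output sample positions expressed in input coordinates
    # (corner-aligned convention, an assumption of this sketch).
    rows = np.linspace(0, h - 1, 2 * h)
    cols = np.linspace(0, w - 1, 2 * w)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]   # fractional row offsets
    fc = (cols - c0)[None, :]   # fractional column offsets
    top = feature_map[r0][:, c0] * (1 - fc) + feature_map[r0][:, c1] * fc
    bottom = feature_map[r1][:, c0] * (1 - fc) + feature_map[r1][:, c1] * fc
    return top * (1 - fr) + bottom * fr

def mean_absolute_error(target, prediction):
    """Average absolute per-pixel difference between two depth maps."""
    return float(np.mean(np.abs(target - prediction)))

# A 2x2 map sampled from the plane d(r, c) = 2r + c.
coarse = np.array([[0.0, 1.0],
                   [2.0, 3.0]])
upsampled = bilinear_upsample_2x(coarse)

# Reference: the same plane sampled directly on the 4x4 grid.
grid = np.linspace(0, 1, 4)
target = 2 * grid[:, None] + grid[None, :]
mae = mean_absolute_error(target, upsampled)
```

Because bilinear interpolation reproduces planar data exactly, `mae` here is zero up to floating-point error; real depth maps would give a non-zero value.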