The introduction of the World Wide Web (WWW) and advances in multimedia and computer technology have greatly enlarged image collections and databases, including art galleries, digital libraries, and medical image repositories that now extend to millions of images. In such large-scale datasets, image retrieval is commonly carried out with classical approaches such as the chi-square distance, colour histograms, and text-based image retrieval, which, however, require considerable time to return the desired images. Thus, there is a need for an effective retrieval system capable of handling such large numbers of images. For this reason, a new Content-Based Image Retrieval (CBIR) system is implemented in this paper for retrieving the images desired by users from the collected images. Initially, different kinds of images, such as medical, texture, and environmental images, are collected from standard image databases. Deep features of all images in the database are extracted using the Visual Geometry Group network (VGG-16), Inception v3, and Xception. The parameters of these three deep learning models are tuned with an enhanced optimization algorithm, the Modified Bypass-Rider Optimization Algorithm (MB-ROA). During the training phase, the features from VGG-16, Inception v3, and Xception are combined through optimal weighted fused feature selection using the enhanced ROA. During testing, each query image undergoes deep feature extraction with the same set of networks, and the resulting features are then fed into the optimal weighted fused feature selection to obtain the features of the query image. A multi-similarity function considering cosine, Euclidean, and Jaccard similarity is then evaluated between the optimal features of the database images and those of the query image. The database images with the minimum multi-similarity score with respect to the query image are retrieved from the database. The overall precision and recall for retrieving the top ten images related to a given query image are 77.25% and 76.89%, respectively, on the Corel dataset, and 70.15% and 70.26%, respectively, on the VisTex dataset. The experimental findings show that the proposed model is effective at retrieving images from the database.
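To make the feature-extraction and fusion step concrete, the following is a minimal sketch using pretrained Keras backbones (VGG16, InceptionV3, Xception) with global average pooling. The fusion weights in `weighted_fusion` are illustrative placeholders standing in for the optimal weights that the MB-ROA search would produce during training; the paper's exact fusion and parameter-tuning procedure is not reproduced here.

```python
# Sketch of multi-network deep feature extraction and weighted fusion.
# The fusion weights below are assumed values, not the MB-ROA-optimized ones.
import numpy as np
from tensorflow.keras.applications import VGG16, InceptionV3, Xception
from tensorflow.keras.applications.vgg16 import preprocess_input as vgg_pre
from tensorflow.keras.applications.inception_v3 import preprocess_input as inc_pre
from tensorflow.keras.applications.xception import preprocess_input as xce_pre
from tensorflow.keras.preprocessing import image

# Pretrained backbones with global average pooling give one vector per image.
vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")
inc = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
xce = Xception(weights="imagenet", include_top=False, pooling="avg")

def extract(img_path):
    """Return the three deep feature vectors for one image."""
    def load(size):
        img = image.load_img(img_path, target_size=size)
        return np.expand_dims(image.img_to_array(img), axis=0)

    f_vgg = vgg.predict(vgg_pre(load((224, 224))), verbose=0)[0]  # 512-d
    f_inc = inc.predict(inc_pre(load((299, 299))), verbose=0)[0]  # 2048-d
    f_xce = xce.predict(xce_pre(load((299, 299))), verbose=0)[0]  # 2048-d
    return f_vgg, f_inc, f_xce

def weighted_fusion(f_vgg, f_inc, f_xce, w=(0.4, 0.3, 0.3)):
    """Concatenate weighted feature vectors; w is a placeholder for the
    optimal weights that MB-ROA would select during training."""
    return np.concatenate([w[0] * f_vgg, w[1] * f_inc, w[2] * f_xce])
```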
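The matching step can likewise be sketched as below, assuming the multi-similarity score is an unweighted sum of the cosine distance, Euclidean distance, and Jaccard distance computed on the fused feature vectors (the Jaccard term is taken over binarized features here). Lower scores indicate closer matches, consistent with retrieving the minimum-score database images; the paper's exact combination may differ.

```python
# Sketch of the multi-similarity matching and top-k retrieval step.
import numpy as np

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def euclidean_distance(a, b):
    return np.linalg.norm(a - b)

def jaccard_distance(a, b, thresh=0.0):
    # Jaccard similarity on thresholded (binary) feature activations.
    a_bin, b_bin = a > thresh, b > thresh
    union = np.logical_or(a_bin, b_bin).sum()
    inter = np.logical_and(a_bin, b_bin).sum()
    return 1.0 - (inter / union if union else 1.0)

def multi_similarity(query, db_feat):
    """Combined score; lower means a better match."""
    return (cosine_distance(query, db_feat)
            + euclidean_distance(query, db_feat)
            + jaccard_distance(query, db_feat))

def retrieve_top_k(query, database, k=10):
    """database: list of (image_id, fused_feature_vector) pairs."""
    ranked = sorted(database, key=lambda item: multi_similarity(query, item[1]))
    return [image_id for image_id, _ in ranked[:k]]
```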