Multimedia content analysis is applied in different real-world computer vision applications, and digital images constitute a major part of multimedia data. In last few years, the complexity of multimedia contents, especially the images, has grown exponentially, and on daily basis, more than millions of images are uploaded at different archives such as Twitter, Facebook, and Instagram. To search for a relevant image from an archive is a challenging research problem for computer vision research community. Most of the search engines retrieve images on the basis of traditional text-based approaches that rely on captions and metadata. In the last two decades, extensive research is reported for content-based image retrieval (CBIR), image classification, and analysis. In CBIR and image classification-based models, high-level image visuals are represented in the form of feature vectors that consists of numerical values. The research shows that there is a significant gap between image feature representation and human visual understanding. Due to this reason, the research presented in this area is focused to reduce the semantic gap between the image feature representation and human visual understanding. In this paper, we aim to present a comprehensive review of the recent development in the area of CBIR and image representation. We analyzed the main aspects of various image retrieval and image representation models from low-level feature extraction to recent semantic deep-learning approaches. The important concepts and major research studies based on CBIR and image representation are discussed in detail, and future research directions are concluded to inspire further research in this area.
Recently, face datasets containing celebrities photos with facial makeup are growing at exponential rates, making their recognition very challenging. Existing face recognition methods rely on feature extraction and reference reranking to improve the performance. However face images with facial makeup carry inherent ambiguity due to artificial colors, shading, contouring, and varying skin tones, making recognition task more difficult. The problem becomes more confound as the makeup alters the bilateral size and symmetry of the certain face components such as eyes and lips affecting the distinctiveness of faces. The ambiguity becomes even worse when different days bring different facial makeup for celebrities owing to the context of interpersonal situations and current societal makeup trends. To cope with these artificial effects, we propose to use a deep convolutional neural network (dCNN) using augmented face dataset to extract discriminative features from face images containing synthetic makeup variations. The augmented dataset containing original face images and those with synthetic make up variations allows dCNN to learn face features in a variety of facial makeup. We also evaluate the role of partial and full makeup in face images to improve the recognition performance. The experimental results on two challenging face datasets show that the proposed approach can compete with the state of the art.
There are different applications of computer vision and digital image processing in various applied domains and automated production process. In textile industry, fabric defect detection is considered as a challenging task as the quality and the price of any textile product are dependent on the efficiency and effectiveness of the automatic defect detection. Previously, manual human efforts are applied in textile industry to detect the defects in the fabric production process. Lack of concentration, human fatigue, and time consumption are the main drawbacks associated with the manual fabric defect detection process. Applications based on computer vision and digital image processing can address the abovementioned limitations and drawbacks. Since the last two decades, various computer vision-based applications are proposed in various research articles to address these limitations. In this review article, we aim to present a detailed study about various computer vision-based approaches with application in textile industry to detect fabric defects. The proposed study presents a detailed overview of histogram-based approaches, color-based approaches, image segmentation-based approaches, frequency domain operations, texture-based defect detection, sparse feature-based operation, image morphology operations, and recent trends of deep learning. The performance evaluation criteria for automatic fabric defect detection is also presented and discussed. The drawbacks and limitations associated with the existing published research are discussed in detail, and possible future research directions are also mentioned. This research study provides comprehensive details about computer vision and digital image processing applications to detect different types of fabric defects.
The requirement for effective image search, which motivates the use of Content-Based Image Retrieval (CBIR) and the search of similar multimedia contents on the basis of user query, remains an open research problem for computer vision applications. The application domains for Bag of Visual Words (BoVW) based image representations are object recognition, image classification and content-based image analysis. Interest point detectors are quantized in the feature space and the final histogram or image signature do not retain any detail about co-occurrences of features in the 2D image space. This spatial information is crucial, as it adversely affects the performance of an image classification-based model. The most notable contribution in this context is Spatial Pyramid Matching (SPM), which captures the absolute spatial distribution of visual words. However, SPM is sensitive to image transformations such as rotation, flipping and translation. When images are not well-aligned, SPM may lose its discriminative power. This paper introduces a novel approach to encoding the relative spatial information for histogram-based representation of the BoVW model. This is established by computing the global geometric relationship between pairs of identical visual words with respect to the centroid of an image. The proposed research is evaluated by using five different datasets. Comprehensive experiments demonstrate the robustness of the proposed image representation as compared to the state-of-the-art methods in terms of precision and recall values.
Facial palsy caused by nerve damage results in loss of facial symmetry and expression. A reliable palsy grading system for large-scale applications is still missing in the literature. Although numerous approaches have been reported on facial palsy quantification and grading, most employ hand-crafted features on relatively smaller datasets which limit the classification accuracy due to non-optimal face representation. In contrast, convolutional neural networks (CNNs) automatically learn the discriminative features facilitating the accurate classification of underlying tasks. In this paper, we propose to apply a typical deep network on a large dataset to extract palsy-specific features from face images. To prevent the inherent limitation of overfitting frequently occurring in CNNs, a generative adversial network (GAN) is applied to augment the training dataset. The deeply learned features are then used to classify the palsy disease into five benchmarked grades. The experimental results show that the proposed approach offers superior palsy grading performance compared to some existing methods. Such an approach is useful for palsy grading at large scale, such as primary health care.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.