For more than a decade, both academia and industry have focused attention on computer vision, and in particular on computational color constancy (CVCC). CVCC is a fundamental preprocessing task in a wide range of computer vision applications. While the human visual system (HVS) has the innate ability to perceive constant surface colors of objects under varying illumination spectra, computer vision systems inherently face the color constancy challenge. Accordingly, this article proposes a novel convolutional neural network (CNN) architecture based on a residual network that combines pre-activation, atrous (dilated) convolution, and batch normalization. The proposed network can automatically decide what to learn from input image data and how to pool without supervision. When receiving input image data, the proposed network crops each image into image patches prior to training. Once the network begins learning, local semantic information is automatically extracted from the image patches and fed to its novel pooling layer. As a result of the semantic pooling, a weighted map, or mask, is generated. Simultaneously, the extracted information is estimated and combined to form global information during training. The novel pooling layer enables the proposed network to distinguish useful data from noisy data and thus efficiently suppress noisy data during both learning and evaluation. The main contribution of the proposed network is raising the accuracy and efficiency of CVCC by adopting this novel pooling method. The experimental results demonstrate that the proposed network outperforms its conventional counterparts in estimation accuracy.
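The abstract does not include code, so the following PyTorch sketch only illustrates the two components it names: a pre-activation residual block with dilated convolution and batch normalization, and a confidence-weighted pooling layer that turns per-patch estimates plus a learned mask into a global illuminant. The class names, the 4-channel output layout, and all hyperparameters are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidencePooling(nn.Module):
    """Weighted-pooling sketch: each spatial location yields a local RGB
    illuminant estimate and a confidence score; the global illuminant is
    their confidence-weighted average, so low-confidence (noisy) regions
    are suppressed automatically."""
    def forward(self, x):
        # x: (B, 4, H, W) -- assumed layout: channels 0..2 are local RGB
        # illuminant estimates, channel 3 is an unnormalized confidence map.
        rgb = F.normalize(x[:, :3], dim=1)           # per-location unit estimates
        conf = F.relu(x[:, 3:4])                     # non-negative weights (the "mask")
        weighted = (rgb * conf).sum(dim=(2, 3))      # (B, 3)
        total = conf.sum(dim=(2, 3)).clamp(min=1e-6)
        return F.normalize(weighted / total, dim=1)  # global illuminant estimate

class ResidualDilatedBlock(nn.Module):
    """Pre-activation residual block with atrous (dilated) convolution
    and batch normalization, as the abstract describes."""
    def __init__(self, channels, dilation=2):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3,
                               padding=dilation, dilation=dilation, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))        # BN -> ReLU -> conv (pre-activation)
        out = self.conv2(F.relu(self.bn2(out)))
        return x + out                               # identity shortcut
```

The pre-activation ordering (BN, then ReLU, then convolution) keeps the identity shortcut clean, which is the usual motivation for that residual variant; the dilated kernels enlarge the receptive field over each image patch without adding parameters.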
In human and computer vision, color constancy is the ability to perceive the true color of objects despite changing illumination conditions. Color constancy benefits human and computer vision tasks such as human tracking, object and human detection, and scene understanding. Traditional color constancy approaches based on the gray-world assumption fall short of being universal predictors, whereas recent color constancy methods have progressed greatly with the introduction of convolutional neural networks (CNNs). Yet shallow CNN-based methods face limitations in learning capability. Accordingly, this article proposes a novel color constancy method that uses a multi-stream deep neural network (MSDNN)-based convoluted mixture of deep experts (CMoDE) fusion technique to perform deep learning and estimate local illumination. In the proposed method, the CMoDE fusion technique is used to extract and learn spatial and spectral features in an image space. As opposed to previous works, where residual networks stack multiple layers linearly and concatenate multiple paths, the proposed method distinctively piles up layers both in series and in parallel, selecting and concatenating effective paths in the CMoDE-based DCNN. As a result, the proposed CMoDE-based DCNN brings significant gains in computing efficiency and illuminant-estimation accuracy. In the experiments, Shi's Reprocessed, gray-ball, and NUS-8 Camera datasets are used to demonstrate invariance to illumination and camera. The experimental results illustrate that this new method surpasses its conventional counterparts.
INDEX TERMS: color constancy, CMoDE fusion technique, multi-stream deep neural network (MSDNN), illumination estimation, residual networks.
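The abstract describes a mixture-of-deep-experts fusion without giving its exact topology, so the following is a minimal PyTorch sketch of the general idea: several parallel expert streams each produce a candidate illuminant estimate, and a gating branch learns weights that select among the paths. The names (CMoDEFusion, n_experts), the expert architecture, and the softmax gating are all assumptions for illustration, not the authors' network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CMoDEFusion(nn.Module):
    """Mixture-of-deep-experts sketch: parallel expert streams propose
    illuminant estimates; a gating network weights the effective paths
    and fuses them into a single global estimate."""
    def __init__(self, in_ch=3, n_experts=3, feat=64):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(feat, 3))                   # per-expert RGB estimate
            for _ in range(n_experts)])
        self.gate = nn.Sequential(                    # gating branch over image stats
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, n_experts))

    def forward(self, x):
        est = torch.stack([F.normalize(e(x), dim=1)
                           for e in self.experts], dim=1)  # (B, E, 3)
        w = F.softmax(self.gate(x), dim=1).unsqueeze(-1)   # (B, E, 1) path weights
        return F.normalize((w * est).sum(dim=1), dim=1)    # fused illuminant
```

Running experts in parallel and letting the gate down-weight unhelpful streams is what makes such a design both serial and parallel in the sense the abstract describes: each expert is itself a deep stack, while fusion happens across streams.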
Abstract. Tone-mapping operators are algorithms designed to reproduce the visibility and the overall impression of brightness, contrast, and color of high dynamic range (HDR) images on low dynamic range (LDR) display devices. Although several new tone-mapping operators have been proposed in recent years, their results have not matched those of psychophysical experiments based on the human visual system. A color-rendering model is presented that combines tone-mapping and cone-response functions in an XYZ tristimulus color space. In the proposed method, the tone-mapping operator reproduces visibility and the overall impression of brightness, contrast, and color in HDR images when mapped onto relatively low dynamic range devices. The tone-mapped image is obtained using chromatic and achromatic colors to avoid the well-known color distortions of conventional methods. The resulting image is then processed with a cone-response function that emphasizes human visual perception (HVP). The proposed method covers the mismatch between the actual scene and the rendered image based on HVP. The experimental results show that the proposed method yields improved color-rendering performance compared to conventional methods. © The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
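The abstract specifies a two-stage pipeline (tone-mapping in XYZ with chromaticity preserved, then a cone-response function) without naming the operators. The NumPy sketch below is a minimal version under stated assumptions: a Reinhard-style global operator for the achromatic stage and a Naka-Rushton-style cone response; the function name, constants, and operator choices are hypothetical, not the paper's model.

```python
import numpy as np

def tone_map_xyz(xyz, a=0.18, sigma=None):
    """Two-stage sketch: (1) compress HDR luminance Y while holding
    chromaticity fixed to avoid hue shifts, then (2) apply a
    Naka-Rushton-style cone response R = I / (I + sigma)."""
    X, Y, Z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    eps = 1e-8
    # --- stage 1: tone-map the achromatic channel only ---
    Y_log_mean = np.exp(np.mean(np.log(Y + eps)))     # log-average luminance
    Y_scaled = a * Y / (Y_log_mean + eps)             # key-value scaling
    Y_tm = Y_scaled / (1.0 + Y_scaled)                # compress to [0, 1)
    # keep chromaticity: rescale X and Z by the same luminance ratio
    ratio = Y_tm / (Y + eps)
    xyz_tm = np.stack([X * ratio, Y_tm, Z * ratio], axis=-1)
    # --- stage 2: cone-response compression ---
    if sigma is None:
        sigma = np.mean(xyz_tm)                       # semi-saturation level
    return xyz_tm / (xyz_tm + sigma + eps)
```

Scaling X and Z by the same ratio as Y is one simple way to keep the (x, y) chromaticity coordinates unchanged during luminance compression, which matches the abstract's goal of separating achromatic compression from chromatic fidelity.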