Semantic scene classification is a useful yet challenging problem in image understanding. Most existing systems are based on low-level features, such as color or texture, and succeed to some extent. Intuitively, semantic features such as sky, water, or foliage, which can be detected automatically, should help close the so-called semantic gap and lead to higher scene classification accuracy. To answer the question of how accurate the detectors themselves need to be, we adopt a generally applicable scene classification scheme that combines semantic features with their spatial layout, encoded implicitly using a block-based method. Our scene classification results show that although our current detectors collectively are still inadequate to outperform low-level features under the same scheme, semantic features hold promise: simulated detectors achieve superior classification accuracy once their individual accuracies exceed a nontrivial 90%.
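The block-based encoding of semantic features can be sketched as follows. This is an illustrative reconstruction, not the paper's exact design: the grid size, the class labels (sky/water/foliage), and the per-block label histograms are all assumptions.

```python
import numpy as np

def block_semantic_features(label_map, n_classes, grid=(2, 2)):
    """Encode a per-pixel semantic label map (e.g. sky=0, water=1) as a
    block-based feature vector: split the image into a grid of blocks and
    let each block contribute a normalized histogram of semantic labels,
    so spatial layout is captured implicitly by block position."""
    h, w = label_map.shape
    gy, gx = grid
    feats = []
    for by in range(gy):
        for bx in range(gx):
            block = label_map[by * h // gy:(by + 1) * h // gy,
                              bx * w // gx:(bx + 1) * w // gx]
            hist = np.bincount(block.ravel(), minlength=n_classes).astype(float)
            feats.append(hist / hist.sum())  # normalize per block
    return np.concatenate(feats)  # length = gy * gx * n_classes

# toy example: an 8x8 "label map" with sky (0) on top, water (1) below
toy = np.zeros((8, 8), dtype=int)
toy[4:, :] = 1
vec = block_semantic_features(toy, n_classes=2, grid=(2, 2))
```

A classifier would then be trained on such vectors; the point of the scheme is that block position preserves coarse spatial layout without explicit geometric modeling.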
This paper presents an approach to predicting image quality by spatially filtering images before generating color difference maps with pixel-based color difference metrics. The resulting difference maps can then be pooled across the whole image. This approach was originally developed for the CIELAB color space under the name S-CIELAB. We extend this approach to use the recently developed ICtCp color space to improve prediction accuracy for high dynamic range and wide color gamut images. The filtering is based on the chromatic and achromatic contrast sensitivity functions of the human visual system. Our results on four existing subjective image quality databases containing high dynamic range and wide color gamut images show substantial improvements at low computational cost, outperforming existing color difference metrics.
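The filter-then-difference pipeline can be sketched as below. This is a simplified stand-in, not the metric itself: plain Gaussian blurs replace the CSF-derived spatial filters, Euclidean distance replaces a calibrated color difference formula, and the sigma values and function names are illustrative assumptions.

```python
import numpy as np

def _blur(channel, sigma):
    # Separable Gaussian blur in pure NumPy -- a crude stand-in for the
    # CSF-derived spatial filters used by S-CIELAB-style metrics.
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(channel, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def filtered_color_difference(ref, test, sigma_achromatic=1.0, sigma_chromatic=3.0):
    """Spatially filter each channel of two 3-channel images (achromatic
    channel first), blurring the chromatic channels more strongly to mimic
    the narrower chromatic contrast sensitivity, then compute a per-pixel
    Euclidean color difference map and pool it with the mean."""
    sigmas = (sigma_achromatic, sigma_chromatic, sigma_chromatic)
    ref_f = np.stack([_blur(ref[..., c], sigmas[c]) for c in range(3)], axis=-1)
    test_f = np.stack([_blur(test[..., c], sigmas[c]) for c in range(3)], axis=-1)
    diff_map = np.sqrt(((ref_f - test_f) ** 2).sum(axis=-1))
    return diff_map.mean()  # pool across the whole image

rng = np.random.default_rng(0)
img = rng.random((16, 16, 3))
noisy = np.clip(img + 0.1 * rng.standard_normal(img.shape), 0, 1)
score_same = filtered_color_difference(img, img)
score_noisy = filtered_color_difference(img, noisy)
```

The design choice the abstract highlights is that only the color space and filters change between the CIELAB and ICtCp variants; the filter-difference-pool structure stays the same, which keeps the computational cost low.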
Abstract-We present an approach to identifying non-cooperative individuals at a distance from a sequence of images using 3D face models. Most biometric features (such as fingerprints, hand shape, iris or retinal scans) require cooperative subjects in close proximity to the biometric system. We process images acquired with an ultra-high resolution video camera, infer the location of the subject's head, use this information to crop the region of interest, build a 3D face model, and use this model to perform biometric identification. To build the 3D model, we use an image sequence, as natural head and body motion provides enough viewpoint variation to perform stereo-motion for 3D face reconstruction. Experiments using a 3D matching engine suggest the feasibility of the proposed approach for recognition against 3D galleries.

I. INTRODUCTION

The field of biometrics has seen rapid growth in the last few years, both in advances in scientific knowledge and in commercial applications. Many biometric features that are highly distinctive and permanent (such as fingerprints, iris or retinal scans) require a cooperative subject in close proximity to the system [1]. Such features become unusable when we must deal with a non-cooperative individual whom we wish to observe unobtrusively and at a distance, as required for many security applications. Facial features can be measured at a distance, without cooperation or even the notice of the observed individuals. Unfortunately, even the best 2D face recognition systems today are neither reliable nor accurate enough for arbitrary lighting and pose in unconstrained environments [2,3]. 3D face recognition is receiving substantial attention because it is commonly thought that 3D shape matching might overcome the fundamental limitations of 2D recognition. The main advantages of using 3D for recognition are compensation for pose and lighting variation.
It appears that recognition using 3D, especially combined with 2D, holds significant promise, and could reach accuracy comparable to other biometric features such as fingerprints and iris. The majority of 3D face recognition research and commercial 3D face recognition systems use range sensors. Stereo cameras, laser scanners, and structured light are the typical range sensors that recover Euclidean 3D shape information from a face. Bowyer et al. [4] point out the desirable properties for an ideal 3D sensor for face recognition applications, based on image acquisition time, depth of field, robust operation under varying lighting conditions, eye safety, and space/depth resolution; none of the currently available 3D sensors meets these requirements. It seems that 3D face recognition using active 3D range sensors is appropriate only at a close distance. We propose instead to perform 3D face recognition using a 3D face model generated from a sequence of images acquired at a distance. To build the 3D model, we use an image sequence, as natural head and body motion provides enough viewpoint variation to perform stereo-motion for 3D face reconstruction...
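The stereo-motion reconstruction ultimately rests on two-view triangulation of matched facial points across frames. A minimal sketch of that building block follows; the camera intrinsics, baseline, and 3D point are hypothetical values chosen for illustration, not taken from the paper.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2 are 3x4 camera projection matrices; x1, x2 are (u, v) pixel
    coordinates of the same point in the two views. Each view contributes
    two rows of the homogeneous system A @ X = 0, solved via SVD."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]           # right singular vector of smallest singular value
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    # Pinhole projection of a 3D point to pixel coordinates.
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical two-view setup: shared intrinsics, second view displaced by
# a small baseline, as natural head motion would provide between frames.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 5.0])
X_hat = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
```

In the noise-free case the DLT solution recovers the point exactly; in practice the poses P1, P2 must themselves be estimated from the image sequence, which is the harder part of stereo-motion reconstruction.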