EXECUTIVE SUMMARY

Background

- Facial recognition algorithms from seven commercial providers and three universities were tested on one laboratory dataset and two operational face recognition datasets, one comprised of visa images, the other of law enforcement mugshots. The population represented in these sets approaches 4 million, making this the largest public evaluation of face recognition technology to date. The project attracted participation from a majority of the known providers of FR technology, including the largest commercial suppliers.
- Accuracy was measured for three applications: one-to-one verification (e.g. of e-passport holders); one-to-one verification against a claimed identity in an enrolled database (e.g. for driver's license re-issuance); and one-to-many search (e.g. for criminal identification or driver's license duplicate detection).
- Face images have been collected in law enforcement for more than a century, but their value for automated identification remains secondary to fingerprints. In criminal investigation settings, face recognition has been used both in an automated mode and for forensic investigation. However, the limits of the technology have not previously been quantified publicly and, in any case, are subject to improvement over time and to the properties of the images in use.
- Core algorithmic capability is the major contributor to application-level recognition outcomes. A second critical factor is the quality of the input images; this is influenced by the design of, and adherence to, image capture protocols (as codified by face recognition standards) and also by the behavior of the person being photographed (e.g. whether they face the camera). Some data collection protocols can embed a human adjudication of quality (e.g. of a visa image by a consular official) while others cannot maintain such tight quality controls (e.g.
because of non-cooperative subjects in police booking processes).
- This is the first time NIST has reported the accuracy of face identification algorithms. Prior tests have treated a 1:N search as equivalent to N 1:1 comparisons. The new protocol formally supports the use of fast search algorithms such as indexing, partitioning, and binning. The benefit is more accurate prediction of scalability to national-size populations.
- The project used archival imagery to assess the core capability of the algorithms. It did not include an instrumented collection of images, as might be used in a scenario or operational test, and therefore did not measure human-camera transactional performance parameters such as duration of use and outcome. These would be of vital interest in, for example, e-passport gate applications.

Core Accuracy

- As with other biometrics, recognition accuracy depends strongly on the provider of the core technology. Broadly, there is an order of magnitude between the best and worst identification error rates.
- Biometric identification algorithms return candidate lists, which enumerate hypothesized identities for a search sample. Face identification...
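The distinction drawn above, between a 1:N search that returns a ranked candidate list and N independent 1:1 comparisons, can be sketched as follows. This is a minimal illustration with made-up cosine-similarity templates; the function and identifier names are hypothetical, not from the evaluation protocol.

```python
import numpy as np

def search_1_to_n(probe, gallery, k=3):
    """Return a rank-ordered candidate list: the k gallery identities whose
    templates score highest against the probe. Treating the same search as
    N independent 1:1 comparisons would instead apply a fixed verification
    threshold to each gallery entry separately, with no ranking."""
    scores = {gid: float(np.dot(probe, t) / (np.linalg.norm(probe) * np.linalg.norm(t)))
              for gid, t in gallery.items()}  # cosine similarity per identity
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

rng = np.random.default_rng(0)
gallery = {f"id{i}": rng.normal(size=64) for i in range(100)}
# Probe: a noisy copy of an enrolled template, simulating a new capture.
probe = gallery["id7"] + 0.1 * rng.normal(size=64)
candidates = search_1_to_n(probe, gallery, k=3)
```

In this toy setup the mated identity `id7` surfaces at the top of the candidate list; real systems additionally use indexing or partitioning so that not every gallery template need be compared.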
Purpose: This work was conducted to determine whether iris recognition accuracy decreases with the time elapsed between collection of the initial enrollment image and the recognition image. More specifically, it seeks to quantify accuracy changes associated with any permanent changes to the iris and its proximal anatomy. This study is intended to quantify natural ageing effects in a healthy population; medical conditions and injuries can rapidly and severely affect recognition, so these are out of scope.

Background: Stability is a required definitional property for a biometric to be useful. Quantitative statements of stability are operationally important because they dictate re-enrollment schedules, e.g. of a face on a passport. The ophthalmologists who filed the initial patents on iris recognition posited the iris to be "extremely stable" over "many years" but held that "features which do develop" do so "rather slowly" [31]. A further patent held that irises have "texture of high complexity, that prove to be immutable over a person's life" [21]. This view held until several recent empirical studies suggested otherwise. Those studies, and ours, were motivated to check the veracity of the 1994 patent's assertion that an enrolled iris can remain viable over decades. Two studies, using separate iris image collections from the University of Notre Dame, reported a large increase in false rejection rates [8, 29]. The studies attempted to account for several possible causes of the observed ageing, but could not conclude that the iris texture itself was changing. Their results, however, were widely reported [59, 24, 3] with statements such as "irises, rather than being stable over a lifetime, are susceptible to ageing effects that steadily change the appearance over time" [33]. A further study, however, identified pupil dilation [27] as the primary causal variable.
Operational iris systems have identified individuals over periods of up to 10 years [5] and 7 years [6].

Conclusions: Using two large operational datasets, we find no evidence of a widespread iris ageing effect. Specifically, the population statistics (mean and variance) are constant over periods of up to nine years. This is consistent with the ability to enroll most individuals and see no degradation in overall recognition accuracy. Furthermore, we compute an ageing rate quantifying how quickly recognition degrades with changes in the iris anatomy; this estimate suggests that iris recognition of average individuals will remain viable over decades. However, given the large population sizes, we identify a small percentage of individuals whose recognition scores do degrade, consistent with disease or an ageing effect. These results are confined to adult populations. Additionally, we show that the template ageing reported in the Notre Dame studies is largely due to systematic dilation change over the collection period. Pupil dilation varies under environmental and several biological influences, with variations occurring on timescales ranging from below one second up to several decades. Our data suggest that the...
The paper measures the ability of face recognition algorithms to distinguish between identical twin siblings. The experimental dataset consists of images of 126 pairs of identical twins (252 people) collected on the same day and 24 pairs of identical twins (48 people) with images collected one year apart. In terms of both the number of pairs of twins and the time elapsed between acquisitions, this is the most extensive investigation of face recognition performance on twins to date. Recognition experiments are conducted using three of the top submissions to the Multiple Biometric Evaluation (MBE) 2010 Still Face Track [1]. Performance results are reported for both same-day and cross-year matching, and are broken out by lighting conditions (studio and outside), expression (neutral and smiling), gender, and age. Confidence intervals were generated by a bootstrap method. This is the most detailed covariate analysis of face recognition of twins to date.
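A percentile bootstrap of the kind mentioned above can be sketched as follows. This is a generic illustration of the technique, not the paper's exact resampling procedure, and the error counts are made up.

```python
import numpy as np

def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap confidence interval for an error rate.

    `outcomes` is a 0/1 array (1 = recognition error). Comparisons are
    resampled with replacement and the error rate recomputed each time;
    the interval is taken from the empirical quantiles of those rates."""
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes)
    rates = [rng.choice(outcomes, size=outcomes.size, replace=True).mean()
             for _ in range(n_resamples)]
    return (float(np.quantile(rates, alpha / 2)),
            float(np.quantile(rates, 1 - alpha / 2)))

# Hypothetical example: 6 errors observed in 120 twin-pair comparisons.
errors = np.array([1] * 6 + [0] * 114)
lo, hi = bootstrap_ci(errors)  # 95% interval around the 5% point estimate
```

A per-subject (rather than per-comparison) resampling scheme is also common when comparisons from the same person are correlated.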
Tatt-C was conducted as an "open-book" test: participants were provided with the dataset and ground-truth data, ran their algorithm(s) on the data on their own hardware following a specified protocol, and provided their system output to NIST for uniform scoring and analysis. Accuracy was measured for the five Tatt-C use cases, including the impact of gallery size for certain scenarios. Detailed descriptions and image examples of the use cases can be found in Section 2.2 of this report.

Key Results

Key results for the five use cases studied are:
• Tattoo Identification evaluated matching different instances of the same tattoo image from the same subject over time. On a gallery size of 4 375, the top-performing algorithm (MorphoTrak) reported a rank 10 hit rate* of 99.4% and mean average precision (MAP)* of 99.4%. Section 3.1
• Region of Interest evaluated matching a subregion of interest that is contained in a larger image canvas. On a gallery size of 4 363, the top-performing algorithm (MorphoTrak) reported a rank 10 hit rate of 97% and MAP of 95.4%. Section 3.2
• Mixed Media evaluated matching visually similar or related tattoos using different types of non-tattoo imagery (i.e. sketches, scanned print, computer graphics, and graffiti). On a gallery size of 55, the top-performing algorithm (MITRE) reported a rank 10 hit rate of 36.5% and MAP of 15.1%. Section 3.3
• Tattoo Similarity evaluated matching visually similar or related tattoos from different subjects. On a gallery size of 272, the top-performing algorithm (MITRE) reported a rank 10 accuracy of 14.9% and MAP of 5.2%. Section 3.4
• Tattoo Detection evaluated detecting whether or not an image contains a tattoo. On a mixed dataset of 1 349 tattoo images and 1 000 face images, the top-performing algorithm (MorphoTrak) reported an overall detection accuracy* of 96.3%.
Section 3.5

Factors that influenced accuracy included:
• Algorithms: Tattoo detection and matching accuracy depend strongly on the implementation of the core technology, as algorithm performance varied substantially. Sections 3.1, 3.2, 3.3, 3.4, and 3.5

* For the definitions of hit rate, MAP, and overall detection accuracy, see Section 2.3. Generally speaking, the higher the hit rate, MAP, and detection accuracy, the more accurate the algorithm.

Finally, the authors are grateful to Amanda Noxon (Michigan State Police), Dr. Jim Matey (NIST), and Mike Garris (NIST) for their thorough and constructive review of this document.

Release Notes. Versioning: This document is Revision 1.0 of the report originally published in September 2015. Typesetting: Virtually all of the tabulated content in this report was produced automatically, using scripting tools to generate directly typesettable LaTeX content. This improves timeliness, flexibility, and maintainability, and reduces transcription errors. Graphics: Many of the figures in this report were produced using Hadley Wickham's ggplot2 [29] package running under R, the capabilities of which extend bey...
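The two headline metrics above, rank-k hit rate and mean average precision, can be computed as in the following sketch. This is a generic illustration of the metrics named in the results (defined in Section 2.3 of the report); the data layout and function names are ours, not the evaluation's API.

```python
def rank_k_hit_rate(searches, k=10):
    """Fraction of searches for which at least one mated (correct) gallery
    entry appears among the top-k returned candidates."""
    hits = sum(1 for ranked, mates in searches if mates & set(ranked[:k]))
    return hits / len(searches)

def mean_average_precision(searches):
    """Mean over searches of average precision across each search's mates:
    precision is evaluated at the rank of every mate found."""
    aps = []
    for ranked, mates in searches:
        found, precisions = 0, []
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in mates:
                found += 1
                precisions.append(found / rank)
        aps.append(sum(precisions) / len(mates) if mates else 0.0)
    return sum(aps) / len(aps)

# Two toy searches: ranked candidate lists paired with their sets of mates.
searches = [(["a", "b", "c"], {"a"}),   # mate at rank 1 -> AP = 1.0
            (["x", "c", "y"], {"c"})]   # mate at rank 2 -> AP = 0.5
```

For example, `mean_average_precision(searches)` here is 0.75, and the rank-2 hit rate is 1.0 while the rank-1 hit rate is 0.5.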
The report is organized with an executive summary, a high-level background, and a technical summary preceding the main body of the report, which gives more detailed information on participation, test design, performance metrics, datasets, and results.

Overview: This report documents the Face in Video Evaluation (FIVE), an independent, public test of face recognition of non-cooperating subjects who are recorded passively and are mostly oblivious to the presence of cameras. The report enumerates the accuracy and speed of face recognition algorithms applied to the identification of persons appearing in video sequences drawn from six different video datasets, mostly sequestered at NIST. These datasets represent video surveillance, chokepoint, and social media applications of the technology. In six cases, videos from fixed cameras are searched against portrait-style photographs of up to 48 000 enrolled identities. In one case, videos are searched against faces enrolled from other videos. Additionally, the effect of supplementing enrollment with non-frontal images is examined.

Participation: FIVE was open to any organization worldwide, at no charge. The research arms of sixteen major commercial suppliers of face recognition technologies submitted thirty-six algorithms, allowing FIVE to document a robust comparative evaluation. The algorithms were submitted to NIST in December 2015, so this report does not capture research and development gains since then. The algorithms are research prototypes, evaluated as black boxes without developer training or tuning. They implement a NIST-specified interface and so may not be immediately commercially available.
They run without graphics processing units (GPUs).

Difficulty: Face recognition is much more difficult in non-cooperative video than with traditional portrait-style photos. The initial face detection task is non-trivial because a scene may contain no faces or many, and these can appear over a range of resolutions (scales), orientations (poses), and illumination conditions. Second, subjects move, so their faces must be tracked through time; this is harder when motion blur occurs or when a face is occluded by closer persons or objects. Third, resolution in video is compromised by optical tradeoffs (magnification, field of view, depth of field, cost) and then by the compression used to satisfy data rate or storage limitations. Finally, other adverse aspects of image quality and face presentation degrade recognition scores, so that scores from unrelated individuals can be similarly high, making discrimination between known and unknown individuals error-prone. This leads to the possibility that a member of the public can be falsely matched to someone on a watchlist; the occurrence of such hazards is mitigated by elevating a recognition threshold.

Key conclusions: This study was conducted to support new and existing applications of face recognition, particularly to assess viability and technology readiness. These range from surveillance, to expedited single-facto...
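The threshold-elevation mitigation described above can be sketched as a simple filtering step on the candidate list. This is an illustrative toy, not the evaluation's scoring code; the identity labels, scores, and threshold values are made up.

```python
def filter_candidates(candidates, threshold):
    """Open-set identification step: discard candidates scoring below the
    threshold, so that probes of people NOT on the watchlist usually yield
    an empty list. Raising the threshold suppresses false matches of the
    public against the watchlist, at the cost of missing some true
    watchlist members whose scores fall below it."""
    return [(gid, score) for gid, score in candidates if score >= threshold]

# A ranked candidate list from a hypothetical watchlist search.
ranked = [("watchlist_17", 0.81), ("watchlist_4", 0.42), ("watchlist_9", 0.33)]
alarms_low = filter_candidates(ranked, 0.7)   # one candidate survives
alarms_high = filter_candidates(ranked, 0.9)  # elevated threshold: no alarm
```

The operational trade-off is between the false positive identification rate (non-mated probes exceeding the threshold) and the false negative identification rate (mated probes rejected by it).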