The last two decades have seen escalating interest in methods for large-scale unconstrained face recognition. While the promise of computer vision systems that efficiently and accurately verify and identify faces in naturally occurring circumstances remains elusive, recent advances in deep learning are bringing us closer to human-level recognition. In this study, the authors propose a new paradigm that employs deep features as the face representation and intra-personal factor analysis as the recogniser. The proposed strategy represents the facial appearance of a person through identity-specific components and intra-personal variation, via a reinterpretation of a Bayesian generative factor analysis model. The authors employ the expectation-maximisation (EM) algorithm to estimate model parameters that cannot be observed directly. Recognition results from benchmarking on the large-scale in-the-wild databases Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) demonstrate that the proposed approach delivers a remarkable face verification improvement over state-of-the-art approaches.
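The abstract names EM for a Bayesian generative factor analysis model but does not spell out the updates. Below is a minimal sketch, in Python/NumPy, of EM for the generic factor analysis model x = μ + Wz + ε with diagonal Gaussian noise; the authors' actual model additionally separates identity-specific and intra-personal latent components, which this simplified sketch does not attempt to reproduce.

```python
import numpy as np

def fa_em(X, k, n_iter=50, seed=0):
    """EM for factor analysis: x = mu + W z + eps, eps ~ N(0, diag(psi)).

    X: (n, d) data matrix; k: number of latent factors.
    Returns the mean mu, factor loadings W (d, k), and noise variances psi (d,).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    W = rng.standard_normal((d, k)) * 0.01   # small random initial loadings
    psi = Xc.var(axis=0) + 1e-6              # initial diagonal noise

    for _ in range(n_iter):
        # E-step: posterior over latent z given each observation.
        Psi_inv_W = W / psi[:, None]                      # Psi^{-1} W (Psi is diagonal)
        Sz = np.linalg.inv(np.eye(k) + W.T @ Psi_inv_W)   # posterior covariance of z
        Ez = Xc @ Psi_inv_W @ Sz                          # (n, k) posterior means
        Ezz = n * Sz + Ez.T @ Ez                          # sum_i E[z_i z_i^T]

        # M-step: update loadings and diagonal noise from the statistics.
        W = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
        psi = np.mean(Xc**2, axis=0) - np.einsum('dk,nk,nd->d', W, Ez, Xc) / n
        psi = np.maximum(psi, 1e-6)                       # keep variances positive

    return mu, W, psi
```

Fitting two such models (to same-identity and different-identity feature pairs) and comparing likelihood ratios is one common way such generative models are turned into a verification score.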
Synthesizing images from text descriptions has become an active research area with the advent of Generative Adversarial Networks (GANs). The main goal is to generate photo-realistic images that are aligned with the input descriptions. Text-to-Face generation (T2F) is a sub-domain of Text-to-Image generation (T2I) that is more challenging due to the complexity and variation of facial attributes, and it has a number of applications, mainly in the domain of public safety. Although several models are available for T2F, image quality and semantic alignment still need improvement. In this research, we propose a novel framework to generate facial images that are well aligned with the input descriptions. Our framework utilizes the high-resolution face generator StyleGAN2 and explores the possibility of using it for T2F: we embed text in the input latent space of StyleGAN2 using BERT embeddings and guide the generation of facial images with the text descriptions. We trained our framework on attribute-based descriptions to generate images at 1024×1024 resolution. The generated images exhibit a 57% similarity to the ground-truth images, with a face semantic distance of 0.92, outperforming state-of-the-art work. The generated images have an FID score of 118.097, and the experimental results show that our model generates promising images.
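As a rough illustration of the described pipeline, the sketch below embeds a description with BERT and projects it into a 512-dimensional vector suitable as a StyleGAN2 input latent. The two-layer projector architecture and the `generator` handle are assumptions for illustration; the abstract states only that BERT embeddings are mapped into StyleGAN2's input latent space.

```python
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

class TextToLatent(nn.Module):
    """Map a 768-d pooled BERT embedding to StyleGAN2's 512-d z space.

    The MLP shape here is a guess; the paper does not specify the mapping network.
    """
    def __init__(self, bert_dim=768, z_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(bert_dim, z_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(z_dim, z_dim),
        )

    def forward(self, emb):
        return self.proj(emb)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()

def embed(texts):
    """Return one sentence embedding per description ([CLS] token state)."""
    toks = tokenizer(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**toks)
    return out.last_hidden_state[:, 0]

# `generator` stands in for a pretrained StyleGAN2 (hypothetical handle):
# z = TextToLatent()(embed(["a young woman with blond hair and eyeglasses"]))
# image = generator(z)  # would yield a 1024x1024 face conditioned on the text
```

In practice the projector would be trained so that images generated from projected embeddings match ground-truth faces, e.g. via an image or face-identity similarity loss, consistent with the semantic-alignment objective the abstract describes.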