We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on the fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.

The network maps the features of a face recognition network [25] into identity parameters for the Basel 2017 Morphable Face Model [8]. Prior work trained an image-to-image autoencoder with a fixed, morphable-model-based decoder and an image-based loss [28]. This paper instead presents a method for training a regression network that removes both the need for supervised training data and the reliance on inverse rendering to reproduce image pixels: the network learns to minimize a loss based on the facial identity features produced by a face recognition network such as VGG-Face [17] or Google's FaceNet [25]. These features are robust to pose, expression, lighting, and even non-photorealistic inputs.
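The three objectives are simple to express given a differentiable renderer and a pretrained recognizer. Below is a minimal PyTorch sketch of the multi-view identity, loopback, and batch distribution losses; `recognition_net`, `renderer`, `regressor`, and the standardized-coefficient assumption for the morphable model prior are hypothetical stand-ins, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def multiview_identity_loss(recognition_net, renderer, params, photo,
                            yaws=(-30.0, 0.0, 30.0)):
    # Compare recognition features of the input photo against features
    # of the predicted 3D face rendered from several viewpoints.
    target = F.normalize(recognition_net(photo), dim=-1)
    loss = 0.0
    for yaw in yaws:
        feat = F.normalize(recognition_net(renderer(params, yaw=yaw)), dim=-1)
        loss = loss + (1.0 - (feat * target).sum(dim=-1)).mean()  # cosine distance
    return loss / len(yaws)

def loopback_loss(regressor, renderer, params):
    # Re-render the predicted face and pass it back through the
    # regressor; the re-estimated parameters should match the originals.
    return F.mse_loss(regressor(renderer(params, yaw=0.0)), params)

def batch_distribution_loss(params):
    # Push the batch statistics of the regressed coefficients toward the
    # morphable model's prior (assumed standard normal for PCA weights).
    mean = params.mean(dim=0)
    var = params.var(dim=0, unbiased=False)
    return (mean ** 2).mean() + ((var - 1.0) ** 2).mean()
```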
Figure 1. This paper introduces Deep Structured Implicit Functions, a 3D shape representation that decomposes an input shape (mesh on the left in every triplet) into a structured set of shape elements (colored ellipses on the right) whose contributions to an implicit surface reconstruction (middle) are represented by latent vectors decoded by a deep network.

Abstract: The goal of this project is to learn a 3D shape representation that enables accurate surface reconstruction, compact storage, efficient computation, consistency for similar shapes, generalization across diverse shape categories, and inference from depth camera observations. Towards this end, we introduce Deep Structured Implicit Functions (DSIF), a 3D shape representation that decomposes space into a structured set of local deep implicit functions. We provide networks that infer the space decomposition and local deep implicit functions from a 3D mesh or posed depth image. In experiments, we find that DSIF provides 10.3 points higher surface reconstruction accuracy (F-Score) than the state of the art (OccNet), while requiring fewer than 1% of the network parameters. Experiments on posed depth image completion and generalization to unseen classes show 15.8 and 17.8 point improvements over the state of the art, while producing a structured 3D representation for each input with consistency across diverse shape collections.
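To make the decomposition concrete, the sketch below evaluates a structured implicit function as a sum of local contributions, each an axis-aligned Gaussian envelope modulating a small latent-conditioned decoder. The function names, parameter layout, and toy decoder are illustrative assumptions, not the paper's architecture.

```python
import torch

def eval_structured_implicit(points, centers, radii, scales, latents, decode):
    """Evaluate a structured implicit function at query points.

    points:  (P, 3) query locations
    centers: (N, 3) element centers
    radii:   (N, 3) per-axis Gaussian radii
    scales:  (N,)   per-element scale constants
    latents: (N, D) latent codes consumed by `decode`
    """
    diff = points[:, None, :] - centers[None, :, :]            # (P, N, 3) local coords
    g = torch.exp(-0.5 * ((diff / radii[None]) ** 2).sum(-1))  # (P, N) Gaussian envelopes
    local = decode(latents, diff)                              # (P, N) decoded contributions
    return (scales[None] * g * local).sum(-1)                  # (P,) implicit values

# Toy decoder: a per-element bias plus a linear map of local coordinates,
# standing in for a small latent-conditioned network.
def toy_decode(latents, diff):
    return latents[:, 0][None, :] + (diff * latents[None, :, 1:4]).sum(-1)

# Example: a random configuration of N=32 elements queried at P=1024 points.
P, N = 1024, 32
vals = eval_structured_implicit(
    torch.randn(P, 3),        # query points
    torch.randn(N, 3),        # element centers
    torch.rand(N, 3) + 0.1,   # per-axis radii (kept positive)
    torch.randn(N),           # per-element scales
    torch.randn(N, 8),        # latent codes
    toy_decode,
)
```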
[Figure: Overview of training and inference on the MannequinChallenge (MC) dataset — training uses static scenes filmed by a moving camera, with RGB images, human masks, initial depth from flow, and MVS depth as supervision; at inference the method predicts depth for moving people filmed by a moving camera.]
This paper presents the results of a study in which artists made line drawings intended to convey specific 3D shapes. The study was designed so that the drawings could be registered with rendered images of 3D models, supporting an analysis of how well the locations of the artists' lines correlate with those of other artists, with current computer graphics line definitions, and with the underlying differential properties of the 3D surface. Lines drawn by artists in this study largely overlapped one another (75% are within 1 mm of another line), particularly along the occluding contours of the object. Most lines that do not overlap the contours overlap large gradients of the image intensity, and correlate strongly with predictions made by recent line drawing algorithms in computer graphics; 14% were not well described by any of the local properties considered in this study. The result of our work is a publicly available data set of aligned drawings, an analysis of where lines appear in that data set based on local properties of 3D models, and algorithms to predict where artists will draw lines for new scenes.
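The overlap statistic lends itself to a distance-transform computation over the registered drawings. A minimal sketch, assuming binary line images already registered to a common view and an assumed pixel-to-millimeter calibration `mm_per_pixel`:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def overlap_fractions(drawings, mm_per_pixel, tol_mm=1.0):
    """For each registered binary drawing (True = line pixel), compute the
    fraction of its line pixels lying within `tol_mm` of a line drawn by
    any *other* artist."""
    fractions = []
    for i, d in enumerate(drawings):
        # Union of all other artists' lines.
        others = np.any([o for j, o in enumerate(drawings) if j != i], axis=0)
        # Distance (in mm) from every pixel to the nearest other-artist line.
        dist = distance_transform_edt(~others) * mm_per_pixel
        fractions.append(float((dist[d] <= tol_mm).mean()))
    return fractions
```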