We introduce a deep-learning-based framework for modeling dynamic hair from monocular videos, which can be captured with a commodity video camera or downloaded from the Internet. The framework consists mainly of two neural networks: HairSpatNet, which infers 3D spatial features of hair geometry from 2D image features, and HairTempNet, which extracts temporal features of hair motion from video frames. The spatial features are represented as 3D occupancy fields depicting the shape of the hair volume and 3D orientation fields indicating the hair growing directions. The temporal features are represented as bidirectional 3D warping fields describing the forward and backward motions of hair strands across adjacent frames. Both HairSpatNet and HairTempNet are trained with synthetic hair data. The spatial and temporal features predicted by the networks are subsequently used to grow hair strands with both spatial and temporal consistency. Experiments demonstrate that our method is capable of constructing plausible dynamic hair models that closely resemble the input video, and that it compares favorably to previous single-view techniques.
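To make the idea of bidirectional warping fields concrete, the following is a minimal sketch rather than the authors' implementation. It treats a warping field as a function that returns a displacement for each 3D point, moves a strand from frame t to frame t+1 with a forward field, and moves it back with a backward field. The names `warp`, `forward_field`, and `backward_field`, the uniform-translation fields, and the round-trip check are all assumptions made for illustration; the abstract does not say how the predicted fields are stored or used.

```python
def warp(points, field):
    """Move each 3D point by the displacement the field assigns to it."""
    return [tuple(c + d for c, d in zip(p, field(p))) for p in points]


def forward_field(point):
    # Hypothetical forward motion (frame t -> t+1): a uniform translation.
    return (0.1, 0.0, -0.05)


def backward_field(point):
    # Hypothetical backward motion (frame t+1 -> t): the opposite translation.
    return (-0.1, 0.0, 0.05)


# A toy strand: three points along the y axis at frame t.
strand_t = [(0.0, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.2, 0.0)]

# Advect the strand to frame t+1, then warp it back to frame t.
strand_t1 = warp(strand_t, forward_field)
round_trip = warp(strand_t1, backward_field)

# If the two fields agree, the round trip returns (almost) the original points.
round_trip_error = max(
    abs(a - b) for p, q in zip(strand_t, round_trip) for a, b in zip(p, q)
)
```

In this toy case the two fields are exact inverses, so `round_trip_error` is essentially zero; with predicted fields the same quantity could serve as one possible measure of how well the forward and backward motions agree.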
In this paper, we present iOrthoPredictor, a novel system to visually predict teeth alignment in photographs. Our system takes a frontal face image of a patient with visible malpositioned teeth along with a corresponding 3D teeth model as input, and generates a facial image with aligned teeth, simulating a real orthodontic treatment effect. The key enabler of our method is an effective disentanglement of an explicit representation of the teeth geometry from the in-mouth appearance, where the accuracy of teeth geometry transformation is ensured by the 3D teeth model while the in-mouth appearance is modeled as a latent variable. The disentanglement enables us to achieve fine-scale geometry control over the alignment while retaining the original teeth appearance attributes and lighting conditions. The whole pipeline consists of three deep neural networks: a U-Net architecture to explicitly extract the 2D teeth silhouette maps representing the teeth geometry in the input photo, a novel multilayer perceptron (MLP) based network to predict the aligned 3D teeth model, and an encoder-decoder-based generative model to synthesize the in-mouth appearance conditioned on the original teeth appearance and the aligned teeth geometry. Extensive experimental results and a user study demonstrate that iOrthoPredictor is effective in qualitatively predicting teeth alignment and is applicable to the orthodontic industry.
Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks. Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discretization agnostic and widely applicable. We extend our implicit model to mandible kinematics for the particular case of facial animation and show that we can reliably reproduce facial expressions captured with high-quality capture systems. We apply the method to volumetric soft bodies, human poses, and facial expressions, demonstrating artist-friendly properties, such as simple control over the latent space and resolution invariance at test time.
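The discretization-agnostic property can be illustrated with a minimal sketch, which is not the authors' implementation. A small network maps any 3D point in material space to a scalar actuation value, so the same function can be sampled on meshes of different resolution. The network size, the sine activation, the random (untrained) weights, and the names `make_actuation_field` and `sample_grid` are assumptions chosen only for illustration.

```python
import math
import random


def make_actuation_field(hidden=16, seed=0):
    """Build a tiny untrained MLP: 3D material point -> scalar actuation."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(hidden)]
    w2 = [rng.gauss(0, 1) / math.sqrt(hidden) for _ in range(hidden)]

    def field(point):
        x, y, z = point
        # Sine activation is an assumption, not taken from the abstract.
        h = [math.sin(w[0] * x + w[1] * y + w[2] * z + b)
             for w, b in zip(w1, b1)]
        return sum(wi * hi for wi, hi in zip(w2, h))

    return field


def sample_grid(n):
    """Return n**3 points on a regular grid over the unit cube."""
    step = 1.0 / (n - 1)
    return [(i * step, j * step, k * step)
            for i in range(n) for j in range(n) for k in range(n)]


field = make_actuation_field()

# Query the same continuous function on a coarse and a fine discretization.
coarse = {p: field(p) for p in sample_grid(3)}
fine = {p: field(p) for p in sample_grid(5)}

# Points shared by both grids receive identical actuation values.
shared_points_agree = all(fine[p] == coarse[p] for p in coarse)
```

Because the actuation lives in the function rather than in per-element parameters, changing the grid resolution only changes where the function is evaluated, not what it returns at a given material point.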
Controlling stroke size in Fast Style Transfer remains a difficult task. So far, only a few attempts have been made to address it, and they still exhibit several deficiencies regarding efficiency, flexibility, and diversity. In this paper, we aim to tackle these problems and propose a recurrent convolutional neural subnetwork, which we call the recurrent stroke‐pyramid, to control the stroke size in Fast Style Transfer. Compared to state‐of‐the‐art methods, our method not only achieves competitive results with far fewer parameters but also provides more flexibility and efficiency: it generalizes to unseen larger stroke sizes and can produce a wide range of stroke sizes with only one residual unit. We further embed the recurrent stroke‐pyramid into the Multi‐Styles and the Arbitrary‐Style models, achieving both style and stroke‐size control in an entirely feed‐forward manner with two novel run‐time control strategies.
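One plausible reading of why a single recurrently applied residual unit can cover many stroke sizes is that each additional pass enlarges the receptive field while the weights stay shared. The sketch below only does that arithmetic and is not the authors' architecture. It assumes a residual unit of two 3x3, stride-1 convolutions with 64 channels; the function names and these sizes are hypothetical, and the link between receptive field and stroke size is an assumption not stated in the abstract.

```python
def receptive_field(num_recurrences, convs_per_unit=2, kernel=3):
    """Receptive field (in pixels) after applying the unit repeatedly.

    Each stride-1 convolution with a k x k kernel widens the field by k - 1.
    """
    field = 1
    for _ in range(num_recurrences * convs_per_unit):
        field += kernel - 1
    return field


def parameter_count(channels=64, convs_per_unit=2, kernel=3):
    """Weights plus biases of one residual unit; independent of recurrences."""
    per_conv = kernel * kernel * channels * channels + channels
    return convs_per_unit * per_conv


# The receptive field grows with every pass through the shared unit...
fields = [receptive_field(k) for k in (1, 2, 3, 4)]

# ...while the parameter count is that of a single unit.
params = parameter_count()
```

Under these assumptions the receptive field grows by four pixels per recurrence, whereas stacking separate units to reach the same field would multiply the parameter count by the number of units.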