Dynamic 3D reconstruction improvement via intensity video guided 4D fusion

Zhang, Jie; Maniatis, Christos; Horna, Luis; Fisher, Robert B.

doi:10.1016/j.jvcir.2018.07.007

Cited by 6 publications

(4 citation statements)

References 26 publications

“…The raw 3D point cloud sequences usually suffer from some spatial noise and temporal fluctuations, due to the sensor technology and data capture procedure. To improve the overall quality of the 3D data, we firstly denoise the 3D point cloud sequence using a multi-frame fusion algorithm [21], but do not reduce the frame rate. On the other hand, facial pose is likely to slightly change while a person is speaking.…”

Section: Preprocessing 3D Lip Sequence (mentioning)

confidence: 99%

3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions

Zhang, Fisher

2021

Preprint

Self Cite

The lip is a dominant dynamic facial unit when a person is speaking. Detecting lip events is beneficial to speech analysis and support for the hearing impaired. This paper proposes a 3D lip event detection pipeline that automatically determines the lip events from a 3D speaking lip sequence. We define a motion divergence measure using 3D lip landmarks to quantify the interframe dynamics of a 3D speaking lip. Then, we cast the interframe motion detection in a multi-temporal-resolution framework that allows the detection to be applicable to different speaking speeds. The experiments on the S3DFM Dataset investigate the overall 3D lip dynamics based on the proposed motion divergence. The proposed 3D pipeline is able to detect opening and closing lip events across 100 sequences, achieving a state-of-the-art performance.

“…[7] Dynamic three-dimensional reconstruction improvement algorithm based on intensity video-guided multi-frame four-dimensional fusion. [8] Three-dimensional fusion framework with controlled regularization parameter which reduces noise at the time of data fusion for generating three-dimensional models. [9] The fusion of data from a one-dimensional laser device and a vision system based on depth estimation for pose estimation and reconstruction.…”

Section: Approach Reference (mentioning)

confidence: 99%

“…Other applications fuse devices with data from coordinate measurement machines [7]. The fusion of multiple VL devices is also considered [8,9,17]. The fusion of infrared and VL devices is also a frequent topic in this sense [6,14–17].…”

Section: Introduction (mentioning)

confidence: 99%

Metrological analysis of the three-dimensional reconstruction based on close-range photogrammetry and the fusion of long-wave infrared and visible-light images

Marcellino, Rosa, Pinto

2020

Meas. Sci. Technol.

This work proposes evaluating statistically the metrological performance of three-dimensional reconstructions built with fused long-wavelength infrared (LWIR) and visible-light (VL) images. The image fusion procedure was essentially based on two-dimensional wavelet transform and two pixel-level fusion rules: the maximum intensity level, presented in a previous work of the authors, and a new fusion rule, which replaces the VL information with the LWIR information in the region of the measured object on the images. The reconstructions of a translucent cube were performed with a point triangulation-based procedure and its dimension measurements were employed as evaluation criteria. The results show that the fused images have more contrast but also more artifacts. The fusion procedures generated denser reconstructions with at least 34.83% more points. Considering the metrological result, reconstructions with only visible-light images resulted in maximal 89.31% less measurement bias but at least 47.25% more uncertainty than the fusion ones. The new fusion rule provided the best results, with more points in the dense cloud and lower uncertainty. The work is important to provide a metrologically viable alternative for three-dimensional reconstruction of objects in situations of low contrast or poor texture information in the visible spectrum, and in which no target can be applied to the inspected part.

“…More details can be seen in Fig.4. Meanwhile, from the corresponding 3D point cloud sequence, 4D spatio-temporal fusion guided by 2D intensity tracking [46] is performed to reduce 3D spatial noise and temporal fluctuations. Because the 2D and 3D images are registered, the 2D FLMs also specify the corresponding 3D FLMs.…”

Section: Proposed Behaviometrics 41 Overview (mentioning)

confidence: 99%

3D Visual passcode: Speech-driven 3D facial dynamics for behaviometrics

Zhang, Fisher

2019

Signal Processing

Self Cite

Face biometrics have achieved remarkable performance over the past decades, but unexpected spoofing of the static faces poses a threat to information security. There is an increasing demand for stable and discriminative biological modalities which are hard to be mimicked and deceived. Speech-driven 3D facial motion is a distinctive and measurable behavior-signature that is promising for biometrics. In this paper, we propose a novel 3D behaviometrics framework based on a "3D visual passcode" derived from speech-driven 3D facial dynamics. The 3D facial dynamics are jointly represented by 3D-keypoint-based measurements and 3D shape patch features, extracted from both static and speech-driven dynamic regions. An ensemble of subject-specific classifiers are then trained over selected discriminative features, which allows for a discriminant speech-driven 3D facial dynamics representation. We construct the first publicly available Speech-driven 3D Facial Motion dataset (S3DFM) that includes 2D-3D face video plus audio samples from 77 participants. The experimental results on the S3DFM show that the proposed pipeline achieves a face identification rate of 96.1%. Detailed discussions are presented, concerning anti-spoofing, head pose variation, video frame rate, and applicability cases. We also give comparison with other baselines on "deep" and "shallow" 2D face features.
