Conditional Generative Adversarial Networks (GANs) for cross-domain image-to-image translation have made much progress recently [7,8,21,12,4,18]. Depending on the task complexity, thousands to millions of labeled image pairs are needed to train a conditional GAN. However, human labeling is expensive, even impractical, and large quantities of labeled data may not always be available. Inspired by dual learning from natural language translation [23], we develop a novel dual-GAN mechanism, which enables image translators to be trained from two sets of unlabeled images from two domains. In our architecture, the primal GAN learns to translate images from domain U to those in domain V, while the dual GAN learns to invert the task. The closed loop made by the primal and dual tasks allows images from either domain to be translated and then reconstructed. Hence a loss function that accounts for the reconstruction error of images can be used to train the translators. Experiments on multiple image translation tasks with unlabeled data show considerable performance gain of DualGAN over a single GAN. For some tasks, DualGAN can even achieve comparable or slightly better results than conditional GAN trained on fully labeled data.
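To make the closed-loop idea above concrete, here is a minimal sketch of the cycle reconstruction loss it describes, assuming two generator networks G_uv (U to V) and G_vu (V to U); the function names, the choice of an L1 penalty, and the PyTorch framing are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch of the closed-loop reconstruction loss described above.
# G_uv and G_vu are assumed generator networks (U -> V and V -> U); the L1
# penalty is a common choice for this term, not necessarily the paper's exact loss.
import torch.nn as nn

def cycle_reconstruction_loss(G_uv, G_vu, real_u, real_v):
    """Reconstruction error after translating U -> V -> U and V -> U -> V."""
    rec_u = G_vu(G_uv(real_u))   # translate a U image and map it back to U
    rec_v = G_uv(G_vu(real_v))   # translate a V image and map it back to V
    l1 = nn.L1Loss()
    return l1(rec_u, real_u) + l1(rec_v, real_v)
```

During training, a term like this would be combined with the adversarial losses of the primal and dual GANs, which is what allows both translators to be learned from unpaired image sets.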
Figure 1: Comparison between traditional 2D stabilization, which uses (a) a single global camera path, and (b) our bundled camera paths stabilization. We plot the camera trajectories (visualized by the y-axis translation over time) and show the original path (red) and the smoothed path (blue) for both methods. Our bundled paths rely on a 2D mesh-based motion representation, and are smoothed in space-time.

Abstract. We present a novel video stabilization method which models camera motion with a bundle of (multiple) camera paths. The proposed model is based on a mesh-based, spatially-variant motion representation and an adaptive, space-time path optimization. Our motion representation allows us to fundamentally handle parallax and rolling shutter effects while it does not require long feature trajectories or sparse 3D reconstruction. We introduce the 'as-similar-as-possible' idea to make motion estimation more robust. Our space-time path smoothing adaptively adjusts smoothness strength by considering discontinuities, cropping size and geometrical distortion in a unified optimization framework. The evaluation on a large variety of consumer videos demonstrates the merits of our method.
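As a rough illustration of the kind of path smoothing the abstract refers to (not the authors' actual formulation), the sketch below smooths a single 1-D camera-path signal by trading off fidelity to the original path against a second-difference smoothness penalty; in the bundled-paths setting, one such path would exist per mesh cell and the smoothness weights would be adapted over space and time.

```python
# Illustrative sketch only: smooth one 1-D camera-path signal C by minimizing
# ||P - C||^2 + lam * ||D2 @ P||^2, where D2 is a second-difference operator.
# The fixed weight lam stands in for the adaptive, per-frame weights described above.
import numpy as np

def smooth_path(C, lam=50.0):
    n = len(C)
    D2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2, n) second-difference matrix
    A = np.eye(n) + lam * D2.T @ D2        # normal equations of the quadratic objective
    return np.linalg.solve(A, C)           # smoothed path P
```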
Classic photometric stereo is often extended to handle real-world materials and to work with unknown lighting conditions for practicality. To quantitatively evaluate non-Lambertian and uncalibrated photometric stereo, however, a photometric stereo image dataset containing objects of various shapes with complex reflectance properties and high-quality ground-truth normals is still missing. In this paper, we introduce the 'DiLiGenT' dataset with calibrated Directional Lightings, objects of General reflectance with different shininess, and 'ground Truth' normals from high-precision laser scanning. We use our dataset to quantitatively evaluate state-of-the-art photometric stereo methods for general materials and unknown lighting conditions, selected from a newly proposed photometric stereo taxonomy emphasizing non-Lambertian and uncalibrated methods. The dataset and evaluation results are made publicly available, and we hope they can serve as a benchmark platform that inspires future research.
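For context on what the benchmark's non-Lambertian and uncalibrated methods extend, here is a minimal sketch of the classic calibrated, Lambertian photometric stereo baseline; the array shapes and helper name are assumptions for illustration and are unrelated to the DiLiGenT release itself.

```python
# Classic Lambertian photometric stereo: solve I = L @ (albedo * n) per pixel
# in the least-squares sense. Purely an illustrative baseline, not benchmark code.
import numpy as np

def lambertian_normals(I, L):
    """I: (num_lights, num_pixels) intensities; L: (num_lights, 3) calibrated light directions."""
    G, _, _, _ = np.linalg.lstsq(L, I, rcond=None)   # (3, num_pixels) albedo-scaled normals
    albedo = np.linalg.norm(G, axis=0)               # per-pixel albedo
    normals = G / np.maximum(albedo, 1e-8)           # unit surface normals
    return normals.T, albedo
```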
Abstract. The field-of-view of standard cameras is very small, which is one of the main reasons that contextual information is not as useful as it should be for object detection. To overcome this limitation, we advocate the use of 360° full-view panoramas in scene understanding, and propose a whole-room context model in 3D. For an input panorama, our method outputs 3D bounding boxes of the room and all major objects inside, together with their semantic categories. Our method generates 3D hypotheses based on contextual constraints and ranks the hypotheses holistically, combining both bottom-up and top-down context information. To train our model, we construct an annotated panorama dataset and reconstruct the 3D model from a single view using manual annotation. Experiments show that, based solely on 3D context without any image-based object detector, we can achieve performance comparable to the state-of-the-art object detector. This demonstrates that when the FOV is large, context is as powerful as object appearance. All data and source code are available online.
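The generate-and-rank pipeline described above can be pictured with the following skeletal sketch; the scoring functions, their combination weight, and all names are placeholders for illustration rather than the paper's actual terms.

```python
# Illustrative skeleton of "generate hypotheses, score them holistically, keep the best".
# bottom_up_score and top_down_score are placeholder callables, not the paper's terms.
def rank_hypotheses(hypotheses, bottom_up_score, top_down_score, alpha=0.5):
    """Return room/object hypotheses sorted by a combined holistic score."""
    scored = [(alpha * bottom_up_score(h) + (1 - alpha) * top_down_score(h), h)
              for h in hypotheses]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [h for _, h in scored]
```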