BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks

Yao, Yao; Luo, Zixin; Li, Shiwei; Zhang, Jingyang; Ren, Yufan; Zhou, Lei; Fang, Tian; Quan, Long

doi:10.1109/cvpr42600.2020.00186

Cited by 332 publications

(206 citation statements)

References 18 publications

Supporting

Mentioning

205

Contrasting


“…One solution is to use synthetic data for training. For instance, Yao et al created BlendedMVS [28], a synthetic dataset based on the rendered depth maps and blended images of meshes generated by existing MVS algorithms. This synthetic data is potentially enough for training MVS algorithms; however, algorithms trained on synthetic data inherently suffer from domain differences with real data.…”

Section: Related Work (mentioning)

confidence: 99%

Cost Volume Pyramid Based Depth Inference for Multi-View Stereo

Yang, Mao, Álvarez, et al. 2020

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)



“…There are 27097 training samples in total. The BlendedMVS dataset [43] is a large-scale dataset with indoor and outdoor scenes. Following [22,34,45], we only use this dataset for training.…”

Section: Datasets (mentioning)

confidence: 99%

“…Overall Evaluation on Tanks and Temples. We train the proposed Bi-Net and GBi-Net on BlendedMVS [43], and testing on Tanks and Temples dataset. We compare our method to state-of-the-art methods.…”

Section: Benchmark Performance (mentioning)

confidence: 99%

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo

Mi, Chen, Xu 2021

Preprint


Multi-view Stereo (MVS) with known camera parameters is essentially a 1D search problem within a valid depth range. Recent deep learning-based MVS methods typically densely sample depth hypotheses in the depth range, and then construct prohibitively memory-consuming 3D cost volumes for depth prediction. Although coarse-to-fine sampling strategies alleviate this overhead issue to a certain extent, the efficiency of MVS is still an open challenge. In this work, we propose a novel method for highly efficient MVS that remarkably decreases the memory footprint, meanwhile clearly advancing state-of-the-art depth prediction performance. We investigate what a search strategy can be reasonably optimal for MVS taking into account of both efficiency and effectiveness. We first formulate MVS as a binary search problem, and accordingly propose a generalized binary search network for MVS. Specifically, in each step, the depth range is split into 2 bins with extra 1 error tolerance bin on both sides. A classification is performed to identify which bin contains the true depth. We also design three mechanisms to respectively handle classification errors, deal with out-of-range samples and decrease the training memory. The new formulation makes our method only sample a very small number of depth hypotheses in each step, which is highly memory efficient, and also greatly facilitates quick training convergence. Experiments on competitive benchmarks show that our method achieves state-of-the-art accuracy with much less memory. Particularly, our method obtains an overall score of 0.289 on DTU dataset and tops the first place on challenging Tanks and Temples advanced dataset among all the learning-based methods. The trained models and code will be released at https://github.com/MiZhenxing/GBi-Net.
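The search step described in this abstract (two bins over the current depth range, plus one error-tolerance bin on each side, followed by a 4-way classification) can be sketched for a single pixel as below. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the tolerance bins are assumed to have the same width as the inner bins, and an oracle that already knows the true depth stands in for the learned classifier.

```python
def oracle_classify(depth, edges):
    """Stand-in for the network's classifier (assumption): return the index
    of the bin [edges[k], edges[k+1]) that contains the true depth."""
    for k in range(len(edges) - 1):
        if edges[k] <= depth < edges[k + 1]:
            return k
    # Depth lies outside all bins: clamp to the nearest outer bin.
    return 0 if depth < edges[0] else len(edges) - 2


def generalized_binary_search(true_depth, d_min, d_max, stages=8):
    """Narrow the depth range by half at every stage and return its centre."""
    lo, hi = d_min, d_max
    for _ in range(stages):
        width = (hi - lo) / 2.0
        # Two inner bins split [lo, hi]; one tolerance bin sits on each side.
        edges = [lo - width, lo, lo + width, hi, hi + width]
        k = oracle_classify(true_depth, edges)
        lo, hi = edges[k], edges[k + 1]
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Depth inside the initial range.
    print(generalized_binary_search(3.7, 1.0, 9.0, stages=10))
    # Depth just outside the initial range: a tolerance bin recovers it.
    print(generalized_binary_search(10.5, 1.0, 9.0, stages=10))
```

Only four hypotheses are evaluated per stage, which is the source of the memory saving the abstract refers to; the side bins let the search move outside the current range when an earlier stage picked the wrong bin or the initial range was too narrow.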


“…The datasets used in our evaluation are DTU (Aanaes et al, 2016), BlendedMVS (Yao et al, 2020), and Tanks & Temples (Knapitsch et al, 2017). Due to the simple camera trajectory of all scenes in DTU, we additionally utilize the BlendedMVS dataset with diverse camera trajectories for training.…”

Section: Datasets (mentioning)

confidence: 99%

Curvature-guided dynamic scale networks for Multi-view Stereo

Truong, Song, Jo 2021

Preprint


Multi-view stereo (MVS) is a crucial task for precise 3D reconstruction. Most recent studies tried to improve the performance of matching cost volume in MVS by designing aggregated 3D cost volumes and their regularization. This paper focuses on learning a robust feature extraction network to enhance the performance of matching costs without heavy computation in the other steps. In particular, we present a dynamic scale feature extraction network, namely, CDSFNet. It is composed of multiple novel convolution layers, each of which can select a proper patch scale for each pixel guided by the normal curvature of the image surface. As a result, CDSFNet can estimate the optimal patch scales to learn discriminative features for accurate matching computation between reference and source images. By combining the robust extracted features with an appropriate cost formulation strategy, our resulting MVS architecture can estimate depth maps more precisely. Extensive experiments showed that the proposed method outperforms other state-of-the-art methods on complex outdoor scenes. It significantly improves the completeness of reconstructed models. As a result, the method can process higher resolution inputs within faster run-time and lower memory than other MVS methods. Our source code is available at https://github.com/TruongKhang/cds-mvsnet

