2007
DOI: 10.1007/s11263-007-0107-3

Modeling the World from Internet Photo Collections

Abstract: There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like "Notre Dame" or "Trevi Fountain." This approach, which we call Photo…

Cited by 1,901 publications (1,274 citation statements). References 81 publications.
“…However, the sequential steps that were followed in this study are outlined below and are similar across all commercial and open-source software packages. Further details on SfM can be found in Fonstad et al (2013), Lowe (2004), Snavely et al (2008), Szeliski (2010) and Verhoeven (2011).…”
Section: Structure from Motion (SfM) Workflow - DEM and Orthomosaic Generation (mentioning)
confidence: 99%
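The sequential SfM steps referenced in that statement follow a common pattern across packages: detect local features (e.g., SIFT, per Lowe 2004), match them across image pairs, estimate two-view geometry, and triangulate an initial sparse point cloud before incremental registration and bundle adjustment. Below is a minimal sketch of that front end using OpenCV; the image paths and the intrinsics matrix K are illustrative assumptions, not values from any cited package.

```python
# Minimal sketch of the shared SfM front end (detect, match, two-view geometry),
# assuming OpenCV >= 4.4 with SIFT available; image paths are placeholders.
import cv2
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],   # assumed pinhole intrinsics
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# 1. Detect SIFT keypoints and descriptors in each image (Lowe 2004).
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# 2. Match descriptors, keeping matches that pass Lowe's ratio test.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# 3. Estimate the essential matrix with RANSAC and recover relative pose.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

# 4. Triangulate correspondences into an initial sparse point cloud; full
#    pipelines then add views incrementally and run bundle adjustment.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points3d = (pts4d[:3] / pts4d[3]).T
```

Full packages iterate steps 3 and 4 as each new view is registered, jointly refining all cameras and points with bundle adjustment.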
“…[1][2][3][4]. Bundle adjustment in general has O(N³) complexity, where N is the number of variables in the problem [5].…”
Section: Introduction (mentioning)
confidence: 99%
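The O(N³) figure comes from the linear solve inside each optimizer iteration. As a sketch of the standard formulation (notation mine, not from the quoted paper): bundle adjustment minimizes total reprojection error, and each Gauss-Newton or Levenberg-Marquardt step solves the normal equations, whose dense factorization is cubic in the number of variables N.

```latex
% Reprojection error over cameras c_j and points X_i, with observed image
% measurements x_ij and projection function \pi:
\[
E(\mathbf{c}, \mathbf{X}) \;=\; \sum_{i,j} \bigl\lVert \pi(\mathbf{c}_j, \mathbf{X}_i) - \mathbf{x}_{ij} \bigr\rVert^2
\]
% Each Levenberg-Marquardt iteration solves the damped normal equations for
% the update \delta, where J is the Jacobian of the stacked residuals r:
\[
(J^{\top} J + \lambda I)\, \boldsymbol{\delta} \;=\; -J^{\top} \mathbf{r}
\]
% J^T J is N x N for N variables, so a dense factorization costs O(N^3).
```

In practice, sparse and Schur-complement solvers exploit the camera-point structure of J to avoid the full cubic cost.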
“…The motivation behind this is the following: Let y_v(x) ∈ ℝ be the disparity value at pixel x. This disparity value can be equivalently encoded by plane parameters u_v(x) ∈ ℝ³, since we can write y_v(x) = p(x)^T u_v(x), where p(x) = (x^T, 1)^T is the homogeneous coordinate representation of x.…”
Section: A Visible Layer for Semantics-Aware Depth Completion (mentioning)
confidence: 99%
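To make the encoding concrete, here is a small numerical check (all values are arbitrary illustrations): a slanted disparity plane y_v(x) = a·x₁ + b·x₂ + c is exactly p(x)^T u_v(x) with u_v(x) = (a, b, c), and a fronto-parallel surface is the special case u_v(x) = (0, 0, d).

```python
import numpy as np

# Plane parameters u = (a, b, c) encode the local disparity plane
# y(x) = a*x1 + b*x2 + c = p(x)^T u, with p(x) = (x1, x2, 1)^T.
u = np.array([0.02, -0.01, 12.5])   # example slanted-plane parameters
x = np.array([320.0, 240.0])        # pixel coordinates
p = np.append(x, 1.0)               # homogeneous coordinates (x^T, 1)^T

disparity = p @ u                   # y(x) = p(x)^T u
print(disparity)                    # 0.02*320 - 0.01*240 + 12.5 = 16.5

# A fronto-parallel plane is the special case u = (0, 0, d): p @ u == d.
assert np.isclose(np.append(x, 1.0) @ np.array([0.0, 0.0, 7.0]), 7.0)
```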
“…While impressive results can be achieved with multi-view and video-based approaches [1][2][3][4], the progress of depth sensors and their decreasing prices make them an attractive alternative, able to capture 3D in a single shot [5]. Unfortunately, even the best depth sensors still provide imperfect measurements.…”
Section: Introduction (mentioning)
confidence: 99%