Estimating a depth map from multiple views of a scene is a fundamental task in computer vision. As soon as more than two viewpoints are available, one faces the basic question of how to measure similarity across more than two image patches. Surprisingly, no direct solution exists; instead, it is common to fall back to more or less robust averaging of two-view similarities. Encouraged by the success of machine learning, and in particular convolutional neural networks, we propose to learn a matching function that directly maps multiple image patches to a scalar similarity score. Experiments on several multi-view datasets demonstrate that this approach has advantages over methods based on pairwise patch similarity.
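The pairwise baseline that this abstract contrasts with can be sketched as follows. This is only an illustration under stated assumptions: normalized cross-correlation (NCC) is chosen here as the two-view similarity, plain averaging stands in for the "more or less robust" variants, and the function names `ncc` and `mean_pairwise_ncc` are hypothetical. It is not the learned multi-view matching function proposed in the paper.

```python
from itertools import combinations
from math import sqrt


def ncc(a, b):
    """Normalized cross-correlation of two equally sized, flattened patches."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # Assumption: a constant (textureless) patch gets a neutral score.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def mean_pairwise_ncc(patches):
    """Average the two-view NCC over all unordered pairs of patches."""
    pairs = list(combinations(patches, 2))
    return sum(ncc(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Three tiny 1x3 "patches": two agree, one is reversed.
    patches = [[1, 2, 3], [2, 4, 6], [3, 2, 1]]
    print(mean_pairwise_ncc(patches))
```

A single outlier patch drags the mean down for the whole set, which is one reason robust variants of the averaging step are used in practice.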
If images acquired from Unmanned Aerial Vehicles (UAVs) need to be accurately geo-referenced, the method of choice is classical aerotriangulation, since on-board sensors are usually not accurate enough for direct geo-referencing. For several different applications it has recently been proposed to mount thermal cameras on UAVs. Compared to optical images, thermal ones pose a number of challenges, in particular low resolution and weak local contrast. In this work we investigate the automatic orientation of thermal image blocks acquired from a UAV, using artificial ground control points. To that end we adapt the photogrammetric processing pipeline to thermal imagery. For the object points, the pipeline achieves accuracies of about ±1 cm in planimetry and ±3 cm in height; for the camera positions it achieves ±10 cm or better, compared to ±100 cm or worse for direct geo-referencing using on-board single-frequency GPS.
For some computer vision tasks, such as location recognition on mobile devices or Structure from Motion (SfM) computation from Internet photo collections, one wants to reduce a large set of images to a compact, representative subset, sometimes called "keyframes" or "skeletal set". We examine the problem of selecting a minimum set of such keyframes from the point of view of discrete optimization, as the search for a minimum connected dominating set (CDS) of the graph of pairwise connections between the database images. Even the simple minimum dominating set (DS) problem is known to be NP-hard, and the constraint that the dominating set should be connected makes it even harder. We show how the minimum DS can nevertheless be solved to global optimality efficiently in practice, by formulating it as an integer linear program (ILP). Furthermore, we show how to upgrade the solution to a connected dominating set with a second ILP if necessary, although the complete method is no longer globally optimal. We also compare the proposed method to a previous greedy heuristic. Experiments with several image sets show that the greedy solution already performs remarkably well, and that the optimal solution achieves roughly 5% smaller keyframe sets which perform equally well in location recognition and SfM tasks.
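The dominating-set condition behind this abstract is that every image is either selected as a keyframe or directly connected to a selected one; the minimum DS is the smallest selection that satisfies it. The sketch below illustrates that condition on a toy graph, under stated assumptions: the graph is a hypothetical six-image chain, the greedy rule (pick the node covering the most uncovered nodes) is a generic heuristic rather than the specific one compared in the paper, and exhaustive search stands in for the ILP solver, so it is only feasible for very small graphs. Connectivity of the result is not enforced here.

```python
from itertools import combinations


def is_dominating(adj, chosen):
    """True if every node is in `chosen` or has a neighbour in `chosen`."""
    chosen = set(chosen)
    return all(v in chosen or adj[v] & chosen for v in adj)


def greedy_dominating_set(adj):
    """Repeatedly pick the node that covers the most still-uncovered nodes."""
    uncovered = set(adj)
    chosen = []
    while uncovered:
        best = max(sorted(adj), key=lambda v: len(({v} | adj[v]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | adj[best]
    return chosen


def minimum_dominating_set(adj):
    """Exhaustive search by increasing size (stand-in for an ILP solver)."""
    nodes = sorted(adj)
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            if is_dominating(adj, subset):
                return list(subset)
    return []


if __name__ == "__main__":
    # Hypothetical match graph: six images connected in a chain 0-1-2-3-4-5.
    adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
    print("greedy:", greedy_dominating_set(adj))
    print("exact: ", minimum_dominating_set(adj))
```

On this small chain both approaches select two keyframes; the gap between greedy and optimal solutions only shows up on larger, less regular graphs.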