This paper presents a new approach to the challenging problem of geo-locating an image using image matching in a structured database of city-wide reference images with known GPS coordinates. We cast geo-localization as a clustering problem on local image features. Like existing approaches to the problem, our framework builds on low-level features, which allow partial matching between images. For each local feature in the query image, we find its approximate nearest neighbors in the reference set. Next, we cluster the features from reference images using Dominant Set clustering, which affords several advantages over existing approaches. First, it permits a variable number of nodes in the cluster, which we use to dynamically select the number of nearest neighbors (typically coming from multiple reference images) for each query feature based on its discrimination value. Second, as we also quantify in our experiments, this approach is several orders of magnitude faster than existing approaches. We thus obtain multiple clusters (different local maximizers) and derive a robust final solution from these multiple weak solutions through constrained Dominant Set clustering on global image features, where we enforce the constraint that the query image must be included in the cluster. This second level of clustering also bypasses heuristic approaches to voting and to selecting the reference image that matches the query. We evaluate the proposed framework on an existing dataset of 102k street view images as well as a new larger dataset of 300k images, and show that it outperforms the state-of-the-art by 20% and 7%, respectively, on the two datasets.
In this paper, a unified three-layer hierarchical approach for solving tracking problems in multiple non-overlapping cameras is proposed. Given a video and a set of detections (obtained by any person detector), we first solve within-camera tracking using the first two layers of our framework and then, in the third layer, we solve across-camera tracking by merging tracks of the same person in all cameras simultaneously. To this end, a constrained dominant sets clustering (CDSC) technique, a parametrized version of standard quadratic optimization, is employed to solve both tracking tasks. The tracking problem is cast as finding constrained dominant sets in a graph. That is, given a constraint set and a graph, CDSC generates a cluster (or clique) that forms a compact and coherent set containing a subset of the constraint set. The approach is based on a parametrized family of quadratic programs that generalizes the standard quadratic optimization problem. In addition to providing a unified framework that simultaneously solves within- and across-camera tracking, the third layer helps link broken tracks of the same person that occur during within-camera tracking. A standard algorithm for extracting a constrained dominant set from a graph is the so-called replicator dynamics, whose computational complexity is quadratic per step, which makes it impractical for large-scale applications. In this work, we propose a fast algorithm, based on dynamics from evolutionary game theory, which is efficient and scalable to large-scale real-world applications. We have tested this approach on a very large and challenging dataset (namely, MOTchallenge DukeMTMC) and show that the proposed framework outperforms the current state of the art. Even though the main focus of this paper is on multi-target tracking in non-overlapping cameras, the proposed approach can also be applied to the re-identification problem. Towards that end, we have also performed experiments on MARS, one of the largest and most challenging video-based person re-identification datasets, and have obtained excellent results. These experiments demonstrate the general applicability of the proposed framework to non-overlapping across-camera tracking and person re-identification tasks.
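The replicator dynamics referred to in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy affinity matrix `A`, the constraint set `S = {3}`, the value `alpha = 2.0`, and all helper names are assumptions made for this example. The constrained variant assumes one common formulation of constrained dominant sets, in which the diagonal entries of vertices outside the constraint set are penalised by `alpha` (chosen larger than the largest eigenvalue of the affinity submatrix over those vertices); a constant shift of all entries keeps the payoffs non-negative without changing the maximizers on the simplex.

```python
def replicator_dynamics(A, max_iter=5000, tol=1e-8):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x) from the simplex barycenter.

    A must be symmetric with non-negative entries. Each step costs O(n^2)
    because of the matrix-vector product.
    """
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:
            break  # no positive affinity among the supported vertices
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        moved = sum(abs(p - q) for p, q in zip(new_x, x))
        x = new_x
        if moved < tol:
            break
    return x


def constrained_payoff(A, S, alpha):
    """Assumed constrained formulation: A - alpha * I_hat_S, shifted by alpha.

    I_hat_S is the diagonal 0/1 matrix marking vertices NOT in S. Adding
    alpha to every entry only adds a constant to the objective on the simplex.
    """
    n = len(A)
    B = [[A[i][j] + alpha for j in range(n)] for i in range(n)]
    for i in range(n):
        if i not in S:
            B[i][i] -= alpha
    return B


def support(x, cutoff=1e-4):
    """Indices whose weight stays clearly above zero: the extracted cluster."""
    return [i for i, xi in enumerate(x) if xi > cutoff]


# Hypothetical toy graph: vertices 0-2 form a tight group, 3-4 a weaker one.
A = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.0],
]

x_free = replicator_dynamics(A)                               # unconstrained
x_con = replicator_dynamics(constrained_payoff(A, {3}, 2.0))  # must involve vertex 3

print(support(x_free))  # [0, 1, 2]
print(support(x_con))   # [3, 4]
```

Without a constraint, the dynamics settle on the most coherent group; with vertex 3 as the constraint, the extracted cluster is the group that contains it. The matrix-vector product inside the loop is the source of the quadratic per-step cost mentioned in the abstract.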
Multi-object tracking is an interesting but challenging task in the field of computer vision. Most previous works based on data association techniques only take into account the relationship between detection responses in a locally limited temporal domain, which makes them inherently prone to identity switches and poorly suited to handling long-term occlusions. In this study, a dominant set clustering based tracker is proposed, which formulates the tracking task as a problem of finding dominant sets in an auxiliary edge-weighted graph. Unlike most techniques, which are limited to temporal locality (i.e. only a few frames are considered), the authors utilise pairwise relationships (in appearance and position) between different detections across the whole temporal span of the video for data association in a global manner. A temporal sliding window technique is used to find tracklets and to perform further merging on them. The authors’ robust tracklet merging step makes the tracker more robust to long-term occlusions. The authors present results on three challenging datasets (i.e. PETS2009-S2L1, TUD-Stadtmitte and the ETH dataset (‘sunny day’ sequence)), and show significant improvements compared with several state-of-the-art methods.
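As a rough illustration of the edge-weighted graph described above, the sketch below builds a pairwise affinity matrix over detections from position and appearance. Everything specific here is an assumption for the example rather than the authors' design: the detection tuple layout, the Gaussian kernels, the bandwidths `sigma_pos` and `sigma_app`, and the rule that two detections in the same frame get zero affinity.

```python
import math


def build_affinity(detections, sigma_pos=50.0, sigma_app=0.5):
    """Return a symmetric affinity matrix over detections.

    Each detection is a hypothetical tuple (frame, (x, y), appearance_vector).
    Affinity is a product of Gaussian kernels on position and appearance
    distance (an assumed choice, not taken from the paper).
    """
    n = len(detections)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        frame_i, pos_i, app_i = detections[i]
        for j in range(i + 1, n):
            frame_j, pos_j, app_j = detections[j]
            if frame_i == frame_j:
                continue  # assumption: one target cannot appear twice in a frame
            d_pos = math.dist(pos_i, pos_j)
            d_app = math.dist(app_i, app_j)
            w = math.exp(-d_pos ** 2 / (2 * sigma_pos ** 2))
            w *= math.exp(-d_app ** 2 / (2 * sigma_app ** 2))
            A[i][j] = A[j][i] = w
    return A


# Two hypothetical targets observed in two consecutive frames.
detections = [
    (0, (100.0, 200.0), (0.90, 0.10)),  # target P, frame 0
    (0, (400.0, 210.0), (0.10, 0.80)),  # target Q, frame 0
    (1, (104.0, 201.0), (0.88, 0.12)),  # target P, frame 1
    (1, (395.0, 212.0), (0.12, 0.79)),  # target Q, frame 1
]
affinity = build_affinity(detections)
```

A matrix of this kind is what a dominant set extraction would operate on: detections of the same target across frames end up strongly connected, while detections of different targets are only weakly linked.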