A Compositional Model for Low-Dimensional Image Set Representation

Mobahi, Hossein; Liu, Ce; Freeman, William T.

doi:10.1109/cvpr.2014.172

Cited by 21 publications

(17 citation statements)

References 25 publications

“…affine). RASL [31], Collection Flow [21] and Mobahi et al [29] first estimate a low-rank subspace of the image collection, and then perform joint alignment among images projected onto the subspace. FlowWeb [40] builds a fully-connected graph for the image collection with images as nodes and pairwise flow fields as edges, and establishes globally-consistent dense correspondences by maximizing the cycle consistency among all edges.…”

Section: Related Work (mentioning)

confidence: 99%
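
The cycle-consistency idea in the statement above can be illustrated with a small sketch. This is not FlowWeb's optimization; it only measures how far a point drifts after following a chain of pairwise flows back to its starting image. The names `Flow`, `cycle_error`, and `constant_flow` are hypothetical, and flows are assumed to be functions that map a pixel location to a displacement.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
# Assumption: a flow maps a pixel location (x, y) to its displacement (dx, dy).
Flow = Callable[[Point], Point]


def cycle_error(flows: Sequence[Flow], start: Point) -> float:
    """Follow the flows in order and return the distance back to the start.

    For a closed cycle of images (e.g. A -> B -> C -> A), a perfectly
    cycle-consistent set of flows returns the point to where it began,
    so the error is 0.
    """
    x, y = start
    for flow in flows:
        dx, dy = flow((x, y))
        x, y = x + dx, y + dy
    return math.hypot(x - start[0], y - start[1])


def constant_flow(dx: float, dy: float) -> Flow:
    """Hypothetical helper: a flow that shifts every pixel by the same offset."""
    return lambda _point: (dx, dy)
```

A set of flows whose displacements sum to zero around the cycle gives an error of zero; any residual displacement shows up directly as the returned distance.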

Learning Dense Correspondence via 3D-Guided Cycle Consistency

Zhou

Krähenbühl

Aubry

et al. 2016

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Discriminative deep learning approaches have shown impressive results for problems where human-labeled ground truth is plentiful, but what about tasks where labels are difficult or impossible to obtain? This paper tackles one such problem: establishing dense visual correspondence across different object instances. For this task, although we do not know what the ground truth is, we know it should be consistent across instances of that category. We exploit this consistency as a supervisory signal to train a convolutional neural network to predict cross-instance correspondences between pairs of images depicting objects of the same category. For each pair of training images we find an appropriate 3D CAD model and render two synthetic views to link in with the pair, establishing a correspondence flow 4-cycle. We use ground-truth synthetic-to-synthetic correspondences, provided by the rendering engine, to train a ConvNet to predict synthetic-to-real, real-to-real and real-to-synthetic correspondences that are cycle-consistent with the ground truth. At test time, no CAD models are required. We demonstrate that our end-to-end trained ConvNet supervised by cycle-consistency outperforms state-of-the-art pairwise matching methods in correspondence-related tasks.

“…In addition, Learned-Miller [21] generalises the dense correspondences between image pairs to an arbitrary number of images by continuously warping each image via a parametric transformation. RSA [37], Collection Flow [18] and Mobahi et al [29] project a collection of images into a lower dimensional subspace and perform a joint alignment among the projected images. AnchorNet [34] learns semantically meaningful parts across categories, although is trained with image labels.…”

Section: Related Work (mentioning)

confidence: 99%

Unsupervised Learning of Landmarks by Descriptor Vector Exchange

Thewlis

Albanie

Bilen

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Figure 1: We propose Descriptor Vector Exchange (DVE), a mechanism that enables unsupervised learning of robust high-dimensional dense embeddings with equivariance losses. The embeddings learned for the category of faces are visualised in the figure above with the help of a query image [8], shown in the centre of the figure. (Left): We colour the locations of pixel embeddings that form the nearest neighbours of the query reference points. (Right): The same reference points are used to retrieve patches amongst a collection of face images. The result is an approximate face mosaic, matching parts across different identities despite the fact that no landmark annotations of any kind were used during learning.

Abstract: Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision. However, this method does not explicitly guarantee that the learned landmarks are consistent with changes between different instances of the same object, such as different facial identities. In this paper, we develop a new perspective on the equivariance approach by noting that dense landmark detectors can be interpreted as local image descriptors equipped with invariance to intra-category variations. We then propose a direct method to enforce such an invariance in the standard equivariant loss. We do so by exchanging descriptor vectors between images of different object instances prior to matching them geometrically. In this manner, the same vectors must work regardless of the specific object identity considered. We use this approach to learn vectors that can simultaneously be interpreted as local descriptors and dense landmarks, combining the advantages of both. Experiments on standard benchmarks show that this approach can match, and in some cases surpass, state-of-the-art performance amongst existing methods that learn landmarks without supervision. Code is available at

* Equal Contribution. James was with the VGG during part of this work.

“…This parameterization allows us to describe a wide variety of motion (or warp) fields. Generic "smooth" warp fields can be described by using a truncated Discrete Cosine Transform (DCT) basis as M. However, more compressed motion bases can also be used [42], [43]. In this work, results are shown using an affine transformation parametrized with a four-dimensional θ that captures rotation, shear, and scaling.…”

Section: Dynamic Model (mentioning)

confidence: 99%
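
A translation-free affine warp field with a four-dimensional θ, as described in the statement above, might look like the following sketch. The exact parametrization used by the citing paper is not given here, so this assumes θ = (a, b, c, d) perturbs the identity matrix, A = I + [[a, b], [c, d]], with coordinates measured from the image centre. The function name `affine_warp_field` is hypothetical.

```python
from typing import List, Sequence, Tuple


def affine_warp_field(
    theta: Sequence[float], height: int, width: int
) -> List[List[Tuple[float, float]]]:
    """Return the per-pixel displacement (dx, dy) of a translation-free affine warp.

    Assumption (not taken from the cited paper): theta = (a, b, c, d) and
    the warp matrix is A = I + [[a, b], [c, d]], so theta = 0 is the
    identity warp. Coordinates are centred on the image.
    """
    a, b, c, d = theta
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    field = []
    for row in range(height):
        y = row - cy
        out_row = []
        for col in range(width):
            x = col - cx
            # Displacement is (A - I) applied to the centred coordinate (x, y).
            out_row.append((a * x + b * y, c * x + d * y))
        field.append(out_row)
    return field
```

With this choice, equal values of a and d give isotropic scaling about the centre, a nonzero b or c alone gives shear, and b = -c gives a small-angle approximation of rotation.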

Reconstructing Video of Time-Varying Sources From Radio Interferometric Measurements

Bouman

Johnson

Dalca

et al. 2018

IEEE Trans. Comput. Imaging

Self Cite

Very long baseline interferometry (VLBI) makes it possible to recover images of astronomical sources with extremely high angular resolution. Most recently, the Event Horizon Telescope (EHT) has extended VLBI to short millimeter wavelengths with a goal of achieving angular resolution sufficient for imaging the event horizons of nearby supermassive black holes. VLBI provides measurements related to the underlying source image through a sparse set of spatial frequencies. An image can then be recovered from these measurements by making assumptions about the underlying image. One of the most important assumptions made by conventional imaging methods is that over the course of a night's observation the image is static. However, for quickly evolving sources, such as the galactic center's supermassive black hole (SgrA*) targeted by the EHT, this assumption is violated and these conventional imaging approaches fail. In this work we propose a new way to model VLBI measurements that allows us to recover both the appearance and dynamics of an evolving source by reconstructing a video rather than a static image. By modeling VLBI measurements using a Gaussian Markov Model, we are able to propagate information across observations in time to reconstruct a video, while simultaneously learning about the dynamics of the source's emission region. We demonstrate our proposed Expectation-Maximization (EM) algorithm, StarWarps, on realistic synthetic observations of black holes, and show how it substantially improves results compared to conventional imaging algorithms. Additionally, we demonstrate StarWarps on real VLBI data of the M87 Jet from the VLBA.
