Image Patch Matching Using Convolutional Descriptors with Euclidean Distance

Melekhov, Iaroslav; Kannala, Juho; Rahtu, Esa

doi:10.1007/978-3-319-54526-4_46

Cited by 19 publications

(13 citation statements)

References 21 publications


“…However, those dissimilarity metrics were not able to accurately reflect the dissimilarity distances between image patches that are subject to significant noise and irregular deformations. Various metrics have been proposed using deep‐learning methods in order to improve the accuracy of the dissimilarity measures. Such learning‐based dissimilarity measures have outperformed the conventional hand‐crafted dissimilarity measures.…”

Section: Methods (mentioning)

confidence: 99%


Automatic large quantity landmark pairs detection in 4DCT lung images

Thomas et al. 2019

Medical Physics


Purpose: To automatically and precisely detect a large quantity of landmark pairs between two lung computed tomography (CT) images to support evaluation of deformable image registration (DIR). We expect that the generated landmark pairs will significantly augment the current lung CT benchmark datasets in both quantity and positional accuracy.

Methods: A large number of landmark pairs were detected within the lung between the end‐exhalation (EE) and end‐inhalation (EI) phases of the lung four‐dimensional computed tomography (4DCT) datasets. Thousands of landmarks were detected by applying the Harris‐Stephens corner detection algorithm on the probability maps of the lung vasculature tree. A parametric image registration method (pTVreg) was used to establish initial landmark correspondence by registering the images at EE and EI phases. A multi‐stream pseudo‐siamese (MSPS) network was then developed to further improve the landmark pair positional accuracy by directly predicting three‐dimensional (3D) shifts to optimally align the landmarks in EE to their counterparts in EI. Positional accuracies of the detected landmark pairs were evaluated using both digital phantoms and publicly available landmark pairs.

Results: Dense sets of landmark pairs were detected for 10 4DCT lung datasets, with an average of 1886 landmark pairs per case. The mean and standard deviation of target registration error (TRE) were 0.47 ± 0.45 mm with 98% of landmark pairs having a TRE smaller than 2 mm for 10 digital phantom cases. Tests using 300 manually labeled landmark pairs in 10 lung 4DCT benchmark datasets (DIRLAB) produced TRE results of 0.73 ± 0.53 mm with 97% of landmark pairs having a TRE smaller than 2 mm.

Conclusion: A new method was developed to automatically and precisely detect a large quantity of landmark pairs between lung CT image pairs. The detected landmark pairs could be used as benchmark datasets for more accurate and informative quantitative evaluation of DIR algorithms.
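The TRE figures quoted in this abstract (mean ± standard deviation, and the share of landmark pairs below 2 mm) are summary statistics over per-landmark Euclidean distances. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. It assumes landmark positions are already expressed in millimetres and uses the population standard deviation; the function names and the toy coordinates are hypothetical.

```python
import numpy as np


def target_registration_error(predicted_mm, reference_mm):
    """Per-landmark Euclidean distance (mm) between predicted and reference points."""
    predicted_mm = np.asarray(predicted_mm, dtype=float)
    reference_mm = np.asarray(reference_mm, dtype=float)
    return np.linalg.norm(predicted_mm - reference_mm, axis=1)


def summarize_tre(tre_mm, threshold_mm=2.0):
    """Mean, population standard deviation, and fraction of landmarks below a threshold."""
    tre_mm = np.asarray(tre_mm, dtype=float)
    return {
        "mean": float(tre_mm.mean()),
        "std": float(tre_mm.std()),  # assumption: population std (ddof=0)
        "fraction_below": float((tre_mm < threshold_mm).mean()),
    }


# Toy 3D landmark coordinates in mm (hypothetical, not taken from the paper).
reference = np.array([[10.0, 20.0, 30.0], [15.0, 25.0, 35.0], [40.0, 40.0, 40.0]])
offsets = np.array([[0.3, 0.4, 0.0], [0.0, 0.0, 1.0], [3.0, 0.0, 0.0]])
predicted = reference + offsets

tre = target_registration_error(predicted, reference)
summary = summarize_tre(tre)
print(tre, summary)
```

If landmark positions are stored as voxel indices instead, they would first need to be scaled by the voxel spacing along each axis before the distances are meaningful in millimetres.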


Section: Methods (mentioning)

confidence: 99%

“…Various metrics have been proposed using deep-learning methods in order to improve the accuracy of the dissimilarity measures. [35][36][37][38][39][40][41] Such learning-based dissimilarity measures have outperformed the conventional hand-crafted dissimilarity measures.…”

Section: C Automatically Locate the Corresponding Landmarks In The (mentioning)

confidence: 99%

Automatic large quantity landmark pairs detection in 4DCT lung images

Thomas et al. 2019

Medical Physics


“…Siamese [22][23][24][39,40] and triplet networks [27][28][29][30] are the mainstream network architectures to learn feature descriptors from raw image patches by training with large volumes of data. The Siamese network is a very popular and well-known deep neural network that uses the same weights, while working in tandem on two different input vectors, to compute comparable output vectors [41,42].…”

Section: Feature Descriptors (mentioning)

confidence: 99%
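
The weight sharing described in this statement can be illustrated with a small NumPy sketch. It is an illustration only, not any of the cited architectures: the single linear layer with ReLU, the 32x32 patch size, the 128-dimensional output, and the names `embed` and `siamese_forward` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 32x32 grayscale patches (flattened to 1024) and 128-D descriptors.
W = rng.normal(scale=0.01, size=(1024, 128))
b = np.zeros(128)


def embed(patch, W, b):
    """One branch: linear layer + ReLU, followed by L2 normalization."""
    x = np.asarray(patch, dtype=float).reshape(-1)
    h = np.maximum(x @ W + b, 0.0)
    return h / (np.linalg.norm(h) + 1e-12)


def siamese_forward(patch_a, patch_b, W, b):
    """Both branches call `embed` with the same (W, b), so the weights are shared."""
    return embed(patch_a, W, b), embed(patch_b, W, b)


patch_a = rng.random((32, 32))
patch_b = patch_a + 0.01 * rng.normal(size=(32, 32))  # slightly perturbed copy

desc_a, desc_b = siamese_forward(patch_a, patch_b, W, b)
distance = float(np.linalg.norm(desc_a - desc_b))
print(desc_a.shape, distance)
```

Because both inputs pass through the same parameters, the two output vectors live in the same space and can be compared directly, for example with the Euclidean distance computed on the last lines.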

“…DeepDesc [22] outputs a 128-dimensional descriptor for an image by using margin-based contrastive loss to train the Siamese network. [39] and [40] proposed similar network frameworks with DeepDesc. Furthermore, there are several variants of the Siamese network, such as L2-Net [23], DeepCD [24], etc.…”

Section: Feature Descriptors (mentioning)

confidence: 99%
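
A margin-based contrastive loss, as mentioned in this statement, can be sketched as follows. This uses one common formulation (squared distance for matching pairs, squared hinge on the margin for non-matching pairs); the exact form used by DeepDesc or the other cited networks may differ, and the function name, the default margin, and the toy descriptors are assumptions for illustration.

```python
import numpy as np


def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Mean contrastive loss over a batch of descriptor pairs.

    Matching pairs (is_match == 1) are penalized by their squared distance.
    Non-matching pairs (is_match == 0) are penalized only if they are
    closer than `margin`.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    is_match = np.asarray(is_match, dtype=float)

    d = np.linalg.norm(desc_a - desc_b, axis=1)
    positive_term = is_match * d ** 2
    negative_term = (1.0 - is_match) * np.maximum(margin - d, 0.0) ** 2
    return float(np.mean(positive_term + negative_term))


# Toy 2-D descriptors (hypothetical): pair distances are 0.5, 0.5 and 2.0.
a = np.zeros((3, 2))
b = np.array([[0.5, 0.0], [0.5, 0.0], [2.0, 0.0]])
labels = np.array([1, 0, 0])

loss = contrastive_loss(a, b, labels, margin=1.0)
print(loss)
```

In this toy batch the third pair contributes nothing: it is a non-matching pair already farther apart than the margin, which is the behaviour that lets training focus on pairs that are still confusable.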

AE-GAN-Net: Learning Invariant Feature Descriptor to Match Ground Camera Images and a Large-Scale 3D Image-Based Point Cloud for Outdoor Augmented Reality

et al. 2019


Establishing the spatial relationship between 2D images captured by real cameras and 3D models of the environment (2D and 3D space) is one way to achieve the virtual–real registration for Augmented Reality (AR) in outdoor environments. In this paper, we propose to match the 2D images captured by real cameras and the rendered images from the 3D image-based point cloud to indirectly establish the spatial relationship between 2D and 3D space. We call these two kinds of images as cross-domain images, because their imaging mechanisms and nature are quite different. However, unlike real camera images, the rendered images from the 3D image-based point cloud are inevitably contaminated with image distortion, blurred resolution, and obstructions, which makes image matching with the handcrafted descriptors or existing feature learning neural networks very challenging. Thus, we first propose a novel end-to-end network, AE-GAN-Net, consisting of two AutoEncoders (AEs) with Generative Adversarial Network (GAN) embedding, to learn invariant feature descriptors for cross-domain image matching. Second, a domain-consistent loss function, which balances image content and consistency of feature descriptors for cross-domain image pairs, is introduced to optimize AE-GAN-Net. AE-GAN-Net effectively captures domain-specific information, which is embedded into the learned feature descriptors, thus making the learned feature descriptors robust against image distortion, variations in viewpoints, spatial resolutions, rotation, and scaling. Experimental results show that AE-GAN-Net achieves state-of-the-art performance for image patch retrieval with the cross-domain image patch dataset, which is built from real camera images and the rendered images from 3D image-based point cloud. 
Finally, by evaluating virtual–real registration for AR on a campus by using the cross-domain image matching results, we demonstrate the feasibility of applying the proposed virtual–real registration to AR in outdoor environments.


“…Neural networks have been widely used to learn discriminative and robust descriptors [3,21]. Those descriptors are then compared pair-wise by thresholding Euclidean distance between them [6,9,22] or by predicting a binary label [37,38]. In contrast, the proposed approach processes the image as a whole, and thus, it can handle a broader set of geometric changes in images and directly predict dense correspondences without any post-processing steps.…”

Section: Introduction (mentioning)

confidence: 99%
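
The pair-wise comparison by thresholding Euclidean distance that this statement refers to can be sketched as below. It is a generic illustration under stated assumptions, not the procedure of any cited paper: the function name, the toy 2-D descriptors, and the threshold value of 0.5 are hypothetical, and in practice the threshold would be tuned on validation data.

```python
import numpy as np


def match_by_threshold(desc_a, desc_b, threshold):
    """Return all pair-wise Euclidean distances and a boolean match matrix.

    desc_a has shape (n, d), desc_b has shape (m, d). Entry [i, j] of the
    result compares descriptor i of the first set with descriptor j of the second.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    diff = desc_a[:, None, :] - desc_b[None, :, :]  # broadcast to (n, m, d)
    dist = np.linalg.norm(diff, axis=2)
    return dist, dist < threshold


# Toy descriptors (hypothetical values).
set_a = np.array([[0.0, 0.0], [1.0, 0.0]])
set_b = np.array([[0.0, 0.1], [1.0, 0.05], [5.0, 5.0]])

dist, is_match = match_by_threshold(set_a, set_b, threshold=0.5)
print(np.round(dist, 3))
print(is_match)
```

The alternative mentioned in the same statement, predicting a binary label, would replace the fixed threshold with a learned classifier applied to each descriptor pair.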

DGC-Net: Dense Geometric Correspondence Network

Melekhov, Tiulpin, Sattler et al. 2019

2019 IEEE Winter Conference on Applications of Computer Vision (WACV)

Self Cite


This paper addresses the challenge of dense pixel correspondence estimation between two images. This problem is closely related to optical flow estimation task where ConvNets (CNNs) have recently achieved significant progress. While optical flow methods produce very accurate results for the small pixel translation and limited appearance variation scenarios, they hardly deal with the strong geometric transformations that we consider in this work. In this paper, we propose a coarse-to-fine CNN-based framework that can leverage the advantages of optical flow approaches and extend them to the case of large transformations providing dense and subpixel accurate estimates. It is trained on synthetic transformations and demonstrates very good performance to unseen, realistic, data. Further, we apply our method to the problem of relative camera pose estimation and demonstrate that the model outperforms existing dense approaches.

