Convolutional Neural Networks Learn Compact Local Image Descriptors

Osendorfer, Christian; Bayer, Justin; Urban, Sebastian; Smagt, Patrick van der

doi:10.1007/978-3-642-42051-1_77

Cited by 17 publications

(17 citation statements)

References 11 publications


“…Descriptor learning using CNNs was addressed early in [11,19], but the experimental results in these works left open questions regarding several practical aspects, such as the most appropriate network architectures and application-dependent training schemes. More recently, the use of Siamese networks for descriptor learning was exploited by concurrent works on joint descriptor and metric learning [10,33,34].…”

Section: Related Work (mentioning)

confidence: 99%

Discriminative Learning of Deep Convolutional Feature Point Descriptors

Simo-Serra

Trulls

Ferraz

et al. 2015

2015 IEEE International Conference on Computer Vision (ICCV)



Deep learning has revolutionized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g. SIFT. In this paper we use Convolutional Neural Networks (CNNs) to learn discriminant patch representations and in particular train a Siamese network with pairs of (non-)corresponding patches. We deal with the large number of potential pairs with the combination of a stochastic sampling of the training set and an aggressive mining strategy biased towards patches that are hard to classify. By using the L2 distance during both training and testing we develop 128-D descriptors whose Euclidean distances reflect patch similarity, and which can be used as a drop-in replacement for any task involving SIFT. We demonstrate consistent performance gains over the state of the art, and generalize well against scaling and rotation, perspective transformation, non-rigid deformation, and illumination changes. Our descriptors are efficient to compute and amenable to modern GPUs, and are publicly available.



“…Our method exceeds the best descriptor variant in (Trzcinski et al, 2015), namely FPBoost512-{64}, in terms of error rate at 95% recall in all training and test data combinations and a performance improvement of nearly 7.1% is achieved. To the best of our knowledge, Osendorfer et al (2013) published the best results for a method for descriptor learning based on Siamese CNN architecture without classifier so far; it is the method most similar to ours in our comparison. Compared to this method, we achieved a performance improvement of 3.5%.…”

Section: Results and Evaluation (mentioning)

confidence: 62%

“…For method TRC (Trzcinski et al, 2015), we chose their best performing descriptor variant for our comparison, which is the floating point version with 64 bits. In method OS (Osendorfer et al, 2013), a descriptor learning architecture based on a Siamese CNN similar to our work was used, but the authors concentrated more on the comparison of different forms of loss functions and their model is trained by standard gradient descent. Finally, SIFT (Lowe, 2004) is used as a general baseline for the descriptor matching, because it is widely acknowledged as a good descriptor in a feature engineering manner.…”

Section: Results and Evaluation (mentioning)

confidence: 99%

“…Jahrer et al (2008) used the Siamese CNN to train the descriptor and compare the patches, but the training data was generated from image warps and dependent on input images, which makes this method less practical, because it always needs a prior simulation and training before image matching. In (Osendorfer et al, 2013), a Siamese CNN is used to train a descriptor; the paper focuses on the comparison of four different types of loss functions. More recently, the Siamese architecture was used to train patch descriptors to cope with dynamic lighting conditions (Carlevaris-Bianco and Eustice, 2014), feeding patches with severe illumination change into a Siamese CNN; illumination invariance that exceeds any hand-crafted descriptors is achieved.…”

Section: Related Work (mentioning)

confidence: 99%

“…For matching pairs, a distance larger than a "pull radius" l_pull is penalised, whereas for non-matching pairs (the negative training examples), penalisation occurs for distances smaller than a "push radius" l_push. This type of loss function has been shown to be suitable for descriptor learning by Osendorfer et al (2013). The two radii are parameters that have to be set by the user.…”

Section: Siamese Descriptor Learning: Architecture (mentioning)

confidence: 99%
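
The pull/push behaviour described in the statement above can be sketched in a few lines. This is only an illustration, not the loss used by Osendorfer et al. or by the citing paper: the quoted text does not say how strongly a violation is penalised, so a linear hinge is assumed here, and the function names and the default radii (0.5 and 1.5) are hypothetical.

```python
import math


def euclidean(a, b):
    """L2 distance between two descriptors given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pull_push_loss(distance, is_match, l_pull=0.5, l_push=1.5):
    """Hinge-style penalty on a descriptor distance.

    ASSUMPTION: a linear hinge; the source only states *when* a pair is
    penalised, not the exact functional form. The radii are placeholders.
    """
    if is_match:
        # Matching pair: penalise only the part of the distance beyond l_pull.
        return max(0.0, distance - l_pull)
    # Non-matching pair: penalise only when the distance falls below l_push.
    return max(0.0, l_push - distance)


if __name__ == "__main__":
    d = euclidean([0.0, 0.0], [3.0, 4.0])
    print(d, pull_push_loss(d, is_match=True), pull_push_loss(d, is_match=False))
```

Pairs that already satisfy their radius contribute zero loss, so under this assumed form only violating pairs would influence training.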


Invariant Descriptor Learning Using a Siamese Convolutional Neural Network

Chen

Rottensteiner

Heipke

2016

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.


ABSTRACT: In this paper we describe learning of a descriptor based on the Siamese Convolutional Neural Network (CNN) architecture and evaluate our results on a standard patch comparison dataset. The descriptor learning architecture is composed of an input module, a Siamese CNN descriptor module and a cost computation module that is based on the L2 Norm. The cost function we use pulls the descriptors of matching patches close to each other in feature space while pushing the descriptors for non-matching pairs away from each other. Compared to related work, we optimize the training parameters by combining a moving average strategy for gradients and Nesterov's Accelerated Gradient. Experiments show that our learned descriptor reaches a good performance and achieves state-of-the-art results in terms of the false positive rate at a 95% recall rate on standard benchmark datasets.
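
The evaluation metric named in this abstract, the false positive rate at a 95% recall rate, can be computed roughly as follows. This is a minimal sketch under the usual reading of the metric (pick the distance threshold that accepts 95% of the matching pairs, then count how many non-matching pairs also fall under it); the function name and the toy distances are hypothetical, and the benchmark's official evaluation protocol may differ in detail.

```python
import math


def fpr_at_recall(match_dists, nonmatch_dists, recall=0.95):
    """Fraction of non-matching pairs accepted at the threshold that
    recovers `recall` of the matching pairs.

    ASSUMPTION: a pair is accepted when its descriptor distance is less
    than or equal to the threshold.
    """
    ordered = sorted(match_dists)
    # Number of matching pairs that must be accepted; the small epsilon
    # guards against floating-point noise in recall * len(ordered).
    k = max(1, math.ceil(recall * len(ordered) - 1e-9))
    threshold = ordered[k - 1]
    false_positives = sum(1 for d in nonmatch_dists if d <= threshold)
    return false_positives / len(nonmatch_dists)


if __name__ == "__main__":
    # Toy distances, not taken from any benchmark.
    matches = list(range(1, 21))
    non_matches = [5, 18, 19, 25]
    print(fpr_at_recall(matches, non_matches))
```

A lower value is better: it means few non-matching patches are mistaken for matches at the operating point where 95% of true matches are found.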


Image Patch Matching Using Convolutional Descriptors with Euclidean Distance

Melekhov

Kannala

Rahtu

2017

Computer Vision – ACCV 2016 Workshops


In this work we propose a neural network based image descriptor suitable for image patch matching, which is an important task in many computer vision applications. Our approach is influenced by recent success of deep convolutional neural networks (CNNs) in object detection and classification tasks. We develop a model which maps the raw input patch to a low dimensional feature vector so that the distance between representations is small for similar patches and large otherwise. As a distance metric we utilize L2 norm, i.e. Euclidean distance, which is fast to evaluate and used in most popular hand-crafted descriptors, such as SIFT. According to the results, our approach outperforms state-of-the-art L2-based descriptors and can be considered as a direct replacement of SIFT. In addition, we conducted experiments with batch normalization and histogram equalization as a preprocessing method of the input data. The results confirm that these techniques further improve the performance of the proposed descriptor. Finally, we show promising preliminary results by appending our CNNs with recently proposed spatial transformer networks and provide a visualisation and interpretation of their impact.
