2012
DOI: 10.1007/978-3-642-33712-3_46

Jet-Based Local Image Descriptors

Abstract: We present a novel, general image descriptor based on higher-order differential geometry and investigate the effect of common descriptor choices. Our investigation is twofold in that we develop a jet-based descriptor and perform a comparative evaluation with current state-of-the-art descriptors on the recently released DTU Robot dataset. We demonstrate how the use of higher-order image structures enables us to reduce the descriptor dimensionality while still achieving very good performance. The descript…
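
To make the jet construction concrete, here is a minimal sketch, assuming SciPy is available, of sampling Gaussian-derivative responses up to fourth order at an interest point. The function name, the σⁿ scale normalization, and the component ordering are illustrative assumptions, not the paper's exact recipe; up to order 4 the jet has 14 components, far fewer than SIFT's 128, which illustrates the dimensionality reduction the abstract refers to.

```python
# Minimal sketch (assumptions: SciPy available; sigma**n scale
# normalization and component ordering are illustrative choices,
# not the paper's exact descriptor).
import numpy as np
from scipy import ndimage


def njet_descriptor(image, x, y, sigma=2.0, max_order=4):
    """Sample the local N-jet (L_x, L_y, L_xx, L_xy, L_yy, ...) at (x, y)."""
    jet = []
    for n in range(1, max_order + 1):      # total derivative order
        for dx in range(n, -1, -1):        # split the order between x and y
            dy = n - dx
            # `order` is per image axis (rows = y, cols = x), so (dy, dx)
            # differentiates dy times along y and dx times along x.
            resp = ndimage.gaussian_filter(image, sigma, order=(dy, dx))
            jet.append(sigma ** n * resp[int(y), int(x)])
    return np.asarray(jet)                 # 14 components for max_order = 4
```

Filtering the full image to read off a single pixel is wasteful but keeps the sketch short; a real implementation would evaluate the derivative filters locally around the point.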


Cited by 21 publications (20 citation statements).
References 20 publications.
“…Given the spatial Gaussian scale-space concept [24,34,44,46,47,59,60,67,70,106,111,120,123], a general methodology for spatial scale selection has been developed based on local extrema over spatial scales of scale-normalized differential entities [62,64,65,72,73]. This general methodology has in turn been successfully applied to develop robust methods for image-based matching and recognition [5,41,52,68,74,84,86,87,89,90,112-114] that are able to handle large variations of the size of the objects in the image domain and with numerous applications regarding object recognition, object categorization, multi-view geometry, construction of 3-D models from visual input,…”

[Fig. 2 caption: The spatial Laplacian applied to the first- and second-order temporal derivatives ∇²_(x,y) L_t and ∇²_(x,y) L_tt, as well as the spatio-temporal Laplacian ∇²_(x,y,t) L, computed from a video sequence in the UCF-101 dataset (Kayaking_g01_c01.avi) at 3 × 3 combinations of the spatial scales σ_(s,1) = 2 pixels (bottom row), σ_(s,2) = 4.6 pixels (middle row) and σ_(s,3) = 10.6 pixels (top row) and the temporal scales σ_(τ,1) = 40 ms (left column), σ_(τ,2) = 160 ms (middle column) and σ_(τ,3) = 640 ms (right column), with the spatial and temporal scale parameters in units of σ_s = √s and σ_τ = √τ, using a time-causal spatio-temporal scale-space representation with a logarithmic distribution of the temporal scale levels for c = 2 (image size: 320 × 172 pixels of original 320 × 240 pixels; frame 90 of 226 frames at 25 frames/s).]

Section: Fig (mentioning)
confidence: 99%
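
The scale-selection principle quoted above can be sketched in a few lines. This is an assumption-laden illustration (SciPy assumed; γ = 1 so the normalization factor is σ², giving the scale-normalized Laplacian σ²(L_xx + L_yy)), not the cited implementations.

```python
# Minimal sketch of Laplacian scale selection (assumptions: SciPy
# available; gamma = 1 normalization, i.e. responses scaled by sigma**2).
import numpy as np
from scipy import ndimage


def select_scale(image, x, y, sigmas):
    """Return the sigma where |sigma^2 * Laplacian of L| peaks at (y, x)."""
    responses = [
        sigma ** 2 * ndimage.gaussian_laplace(image, sigma)[int(y), int(x)]
        for sigma in sigmas
    ]
    return sigmas[int(np.argmax(np.abs(responses)))]


# A logarithmic scale sampling, analogous to the logarithmically
# distributed scale levels mentioned in the figure caption above:
# sigmas = np.geomspace(1.0, 16.0, num=9)
```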
“…Dense local approaches have been investigated by Jurie and Triggs [62], Lazebnik et al [84], Bosch et al [18], Agarwal and Triggs [2] and Tola et al [150]. More recently, Larsen et al [81] made use of multi-local N-jet descriptors that do not rely on spatial statistics of receptive field responses, as used in the SIFT and SURF descriptors or their analogues. A notable observation from experimental results is that very good performance can be obtained with coarsely quantized, even binary, image descriptors (Pietikäinen et al [131], Linde and Lindeberg [88], Calonder et al [26]).…”
Section: Related Work (mentioning)
confidence: 99%
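
The coarse-quantization observation in the quote above can be illustrated with a toy sketch. The per-component median thresholding and the pairing with Hamming distance are assumptions chosen for illustration, not the specific schemes of the works cited.

```python
# Toy sketch of coarse binary quantization (assumption: per-component
# thresholds, e.g. medians estimated from a sample of descriptors;
# the cited works use their own, different quantization schemes).
import numpy as np


def binarize(descriptor, thresholds):
    """One bit per component: is the response above its threshold?"""
    return (descriptor > thresholds).astype(np.uint8)


def hamming(code_a, code_b):
    """Number of disagreeing bits between two binary codes."""
    return int(np.count_nonzero(code_a != code_b))
```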
“…[7]) have been demonstrated to be highly useful for this purpose with many successful applications, including multi-view image matching, object recognition, 3-D object and scene modelling, video tracking, gesture recognition, panorama stitching as well as robot localization and mapping. Different generalizations of the SIFT operator in terms of the image descriptor have been presented by Ke and Sukthankar [66], Mikolajczyk and Schmid [125], Burghouts and Geusebroek [24], Toews and Wells [149], van de Sande et al [138], Tola et al [150] and Larsen et al [81].…”
(mentioning)
confidence: 99%
“…Unfortunately, this approach can only handle moderate changes in illumination, and it failed on our test sequences. While they have never been used for direct image alignment (to the best of our knowledge), it seems interesting to use "local jets" for the d function [4,22,10,11]. Local jets are vectors often used as local descriptors and efficiently computed by convolving an image with a series of filters:…”
Section: Descriptor Fields (mentioning)
confidence: 99%
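
The quote breaks off before its formula, so the following is only an assumption-laden sketch of the idea it describes: a dense per-pixel jet, obtained by convolving the image with a bank of Gaussian-derivative filters, serving as the "d" function. It mirrors the per-point construction sketched after the abstract; which derivative orders enter d is a choice the truncated quote leaves open.

```python
# Minimal sketch of a dense local-jet field (assumptions: SciPy
# available; derivative orders up to 2 are an illustrative choice).
import numpy as np
from scipy import ndimage


def jet_field(image, sigma=2.0, max_order=2):
    """Per-pixel jet: stack Gaussian-derivative responses along axis 0."""
    channels = []
    for n in range(1, max_order + 1):
        for dx in range(n, -1, -1):
            dy = n - dx
            channels.append(
                ndimage.gaussian_filter(image, sigma, order=(dy, dx))
            )
    return np.stack(channels)  # shape: (num_filters, H, W)


# Direct alignment would then minimize differences between jet_field(I1)
# and the warped jet_field(I2) instead of between raw intensities.
```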