“…Given the spatial Gaussian scale-space concept [24,34,44,46,47,59,60,67,70,106,111,120,123], a general methodology for spatial scale selection has been developed based on local extrema over spatial scales of scale-normalized differential entities [62,64,65,72,73]. This general methodology has in turn been successfully applied to develop robust methods for image-based matching and recognition [5,41,52,68,74,84,86,87,89,90,112,113,114] that are able to handle large variations of the size of the objects in the image domain and with numerous applications regarding object recognition, object categorization, multi-view geometry, construction of 3-D models from visual input,…”

[Fig. 2: The spatial Laplacian applied to the first- and second-order temporal derivatives, $\nabla^2_{(x,y)} L_t$ and $\nabla^2_{(x,y)} L_{tt}$, as well as the spatio-temporal Laplacian $\nabla^2_{(x,y,t)} L$, computed from a video sequence in the UCF-101 dataset (Kayaking_g01_c01.avi) at 3 × 3 combinations of the spatial scales (bottom row) $\sigma_{s,1} = 2$ pixels, (middle row) $\sigma_{s,2} = 4.6$ pixels and (top row) $\sigma_{s,3} = 10.6$ pixels and the temporal scales (left column) $\sigma_{\tau,1} = 40$ ms, (middle column) $\sigma_{\tau,2} = 160$ ms and (right column) $\sigma_{\tau,3} = 640$ ms, with the spatial and temporal scale parameters in units of $\sigma_s = \sqrt{s}$ and $\sigma_\tau = \sqrt{\tau}$ and using a time-causal spatio-temporal scale-space representation with a logarithmic distribution of the temporal scale levels for $c = 2$ (image size: 320 × 172 pixels of original 320 × 240 pixels; frame 90 of 226 frames at 25 frames/s).]
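
The scale selection principle quoted above (local extrema over scales of scale-normalized differential entities) can be illustrated with a minimal sketch. It assumes an idealized 2-D Gaussian blob of variance `s0`, for which the Gaussian scale-space representation is known in closed form: smoothing with a Gaussian of variance `s` gives a Gaussian of variance `s0 + s`. The function names, the blob model and the scale grid are hypothetical choices made for this illustration and are not taken from the cited works.

```python
import math


def normalized_laplacian_at_center(s, s0):
    """Scale-normalized Laplacian s * (L_xx + L_yy) at the centre of a
    2-D Gaussian blob of variance s0, observed at spatial scale s.

    Assumption of this sketch: the input is an ideal unit-mass Gaussian,
    so L(.; s) is a Gaussian of variance s0 + s, whose Laplacian at the
    origin equals -1 / (pi * (s0 + s)**2).
    """
    return -s / (math.pi * (s0 + s) ** 2)


def select_scale(s0, scales):
    """Return the scale at which the magnitude of the scale-normalized
    Laplacian has its strongest interior local maximum over `scales`,
    or None if there is no interior local maximum."""
    responses = [abs(normalized_laplacian_at_center(s, s0)) for s in scales]
    best = None
    for i in range(1, len(scales) - 1):
        is_local_max = (responses[i] > responses[i - 1]
                        and responses[i] > responses[i + 1])
        if is_local_max and (best is None or responses[i] > responses[best]):
            best = i
    return None if best is None else scales[best]


# Geometric (logarithmically spaced) grid of scale levels s = sigma**2.
scales = [2.0 ** (k / 4.0) for k in range(-8, 41)]

s0 = 16.0  # blob variance, i.e. sigma_0 = 4 pixels (hypothetical value)
selected = select_scale(s0, scales)
print("selected scale s =", selected, "sigma =", math.sqrt(selected))
```

In this idealized setting the selected scale follows the size of the blob: enlarging the blob moves the extremum over scales by the same factor. This is the property that lets the matching and recognition methods mentioned in the passage handle large size variations.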