Development of localized oriented receptive fields by learning a translation-invariant code for natural images

Rao, Rajesh P. N.; Ballard, Dana H.

doi:10.1088/0954-898x_9_2_005

Cited by 14 publications

(14 citation statements)

References 60 publications

(75 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…It generalizes previous approaches based on first-order Taylor series expansions of images (Black & Jepson, 1996;Rao & Ballard, 1998), which can account for only small transformations due to their assumption of a linear generative model for the transformed images. The Lie approach, on the other hand, utilizes a matrix-exponential-based generative model that can handle arbitrarily large transformations once the correct transformation operators have been learned.…”

Section: Introductionmentioning

confidence: 82%

“…Defining d dz I = G I for some operator matrix G, we can rewrite equation 2.3 as I (z) = e zG I 0 , which is the same as equation 2.2 with I 0 = I (0). Thus, some previous approaches based on first-order Taylor series expansions (Shi & Tomasi, 1994;Black & Jepson, 1996;Rao & Ballard, 1998) can be viewed as special cases of the Lie group-based generative model.…”

Section: Continuous Transformations and Lie Groupsmentioning

confidence: 99%

“…When z and G i are correctly estimated, any new transformations in the reference image I(0) are absorbed by the transformation component, keeping the original image stable. Thus, an object recognition system trained on reference images I(0) (e.g., eigen-based systems; see equation 3.2) will continue to recognize training objects despite the presence of transformations, thereby achieving perceptual invariance (see Rao & Ballard, 1998).…”

Section: Learning Lie Transformation Groupsmentioning

confidence: 99%

See 2 more Smart Citations

Learning the Lie Groups of Visual Invariance

Xu¹,

Rao

2007

Neural Computation

View full text Add to dashboard Cite

A fundamental problem in biological and machine vision is visual invariance: How are objects perceived to be the same despite transformations such as translations, rotations, and scaling? In this letter, we describe a new, unsupervised approach to learning invariances based on Lie group theory. Unlike traditional approaches that sacrifice information about transformations to achieve invariance, the Lie group approach explicitly models the effects of transformations in images. As a result, estimates of transformations are available for other purposes, such as pose estimation and visuomotor control. Previous approaches based on first-order Taylor series expansions of images can be regarded as special cases of the Lie group approach, which utilizes a matrix-exponential-based generative model of images and can handle arbitrarily large transformations. We present an unsupervised expectation-maximization algorithm for learning Lie transformation operators directly from image data containing examples of transformations. Our experimental results show that the Lie operators learned by the algorithm from an artificial data set containing six types of affine transformations closely match the analytically predicted affine operators. We then demonstrate that the algorithm can also recover novel transformation operators from natural image sequences. We conclude by showing that the learned operators can be used to both generate and estimate transformations in images, thereby providing a basis for achieving visual invariance.

show abstract

Section: Introductionmentioning

confidence: 82%

Section: Continuous Transformations and Lie Groupsmentioning

confidence: 99%

Section: Learning Lie Transformation Groupsmentioning

confidence: 99%

See 1 more Smart Citation

Learning the Lie Groups of Visual Invariance

Xu¹,

Rao

2007

Neural Computation

View full text Add to dashboard Cite

show abstract

“…A fundamental problem in vision is to simultaneously recognize objects and their transformations (Anderson & Van Essen, 1987;Olshausen et al, 1995;Rao & Ballard, 1998;Rao & Ruderman, 1999;Tenenbaum & Freeman, 2000). Bilinear generative models provide a tractable way of addressing this problem by factoring an image into object features and transformations using a bilinear function.…”

Section: Discussionmentioning

confidence: 99%

Bilinear Sparse Coding for Invariant Vision

Grimes

Rao

2005

Neural Computation

View full text Add to dashboard Cite

Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. We describe an unsupervised algorithm for learning both localized features and their transformations directly from images using a sparse bilinear generative model. We show that from an arbitrary set of natural images, the algorithm produces oriented basis filters that can simultaneously represent features in an image and their transformations. The learned generative model can be used to translate features to different locations, thereby reducing the need to learn the same feature at multiple locations, a limitation of previous approaches to sparse coding and ICA. Our results suggest that by explicitly modeling the interaction between local image features and their transformations, the sparse bilinear approach can provide a basis for achieving transformation-invariant vision.

show abstract

“…As Schweitzer notices [33] the algorithm is likely to get stuck in local minima, since it comes from a linearization and uses gradient descent methods. On the other hand, Rao [32] has proposed a neural-network which can learn a translation-invariant code for natural images. Although he suggests updating the appearance basis, the experiments show only translation-invariant recognition, as proposed by Black and Jepson [4].…”

Section: Adding Motion Into the Subspace Formulationmentioning

confidence: 99%

Robust parameterized component analysis: theory and applications to 2D facial appearance models

Torre

Black

2003

Computer Vision and Image Understanding

View full text Add to dashboard Cite

Abstract. Principal Component Analysis (PCA) has been successfully applied to construct linear models of shape, graylevel, and motion. In particular, PCA has been widely used to model the variation in the appearance of people's faces. We extend previous work on facial modeling for tracking faces in video sequences as they undergo significant changes due to facial expressions. Here we develop person-specific facial appearance models (PSFAM), which use modular PCA to model complex intra-person appearance changes. Such models require aligned visual training data; in previous work, this has involved a time consuming and errorprone hand alignment and cropping process. Instead, we introduce parameterized component analysis to learn a subspace that is invariant to affine (or higher order) geometric transformations. The automatic learning of a PSFAM given a training image sequence is posed as a continuous optimization problem and is solved with a mixture of stochastic and deterministic techniques achieving sub-pixel accuracy. We illustrate the use of the 2D PSFAM model with several applications including video-conferencing, realistic avatar animation and eye tracking.

show abstract

Development of localized oriented receptive fields by learning a translation-invariant code for natural images

Cited by 14 publications

References 60 publications

Learning the Lie Groups of Visual Invariance

Learning the Lie Groups of Visual Invariance

Bilinear Sparse Coding for Invariant Vision

Robust parameterized component analysis: theory and applications to 2D facial appearance models

Contact Info

Product

Resources

About