From information scaling of natural images to regimes of statistical models

Wu, Ying; Guo, Cheng-en; Zhu, Song‐Chun

doi:10.1090/s0033-569x-07-01063-2

Cited by 42 publications

(35 citation statements)

References 73 publications

“…However, the rendition of the "primal sketch" [54] in [35] does not guarantee that the construction is "lossless" with respect to any particular task, because there is no underlying task guiding the construction. Our work also relates to the vast literature on segmentation, particularly texture-structure transitions [83]. Alternative approaches to this task could be specified in terms of sparse coding [59] and non-local filtering [15].…”

Section: Summary and Discussion

mentioning

confidence: 99%

Actionable Information in Vision

Soatto

2013

Machine Learning for Computer Vision

Summary. A notion of visual information is introduced as the complexity not of the raw images, but of the images after the effects of nuisance factors such as viewpoint and illumination are discounted. It is rooted in ideas of J. J. Gibson, and stands in contrast to traditional information as entropy or coding length of the data regardless of its use, and regardless of the nuisance factors affecting it. The non-invertibility of nuisances such as occlusion and quantization induces an "information gap" that can only be bridged by controlling the data acquisition process. Measuring visual information entails early vision operations, tailored to the structure of the nuisances so as to be "lossless" with respect to visual decision and control tasks (as opposed to data transmission and storage tasks implicit in communications theory). The definition of visual information suggests desirable properties that a visual representation should possess to best accomplish vision-based decision and control tasks.

“…In object recognition, sketch features are shown to work well on objects with regular shapes, while texture features are more suitable for complex objects with cluttered appearance. These two types of features are often studied separately for structures at different resolutions, but in real images, they are connected continuously through image scaling [8]. That is, viewed in low resolution, geometric structures become blurred and merge into texture appearance, and can become flat area (white noise) at extremely low resolution.…”

Section: Local Feature Descriptors

mentioning

confidence: 99%

Learning Hybrid Image Templates (HIT) by Information Projection

Zhu

2012

IEEE Trans. Pattern Anal. Mach. Intell.

Self Cite

Abstract. This paper presents a novel framework for learning a generative image representation, the hybrid image template (HIT), from a small number (i.e., 3 to 20) of image examples. Each learned template is typically composed of 50 to 500 image patches whose geometric attributes (location, scale, orientation) may adapt in a local neighborhood for deformation, and whose appearances are characterized respectively by four types of descriptors: local sketch (edge or bar), texture gradients with orientations, flatness regions, and colors. These heterogeneous patches are automatically ranked and selected from a large pool according to their information gains using an information projection framework. Intuitively, a patch has a higher information gain if (i) its feature statistics are consistent within the training examples and distinctive from the statistics of negative examples (i.e., generic images or examples from other categories); and (ii) its feature statistics have less intra-class variation. The learning process pursues the most informative patches (for either generative or discriminative purposes) one at a time and stops when the information gain is within statistical fluctuation. The template is associated with a well-normalized probability model that integrates the heterogeneous feature statistics. This automated feature selection procedure allows our algorithm to scale up to a wide range of image categories, from those with regular shapes to those with stochastic texture. The learned representation captures the intrinsic characteristics of the object or scene categories. We evaluate the hybrid image templates on several public benchmarks and demonstrate classification performance on par with state-of-the-art methods such as HoG+SVM; when small training sample sizes are used, the proposed system shows a clear advantage.
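The ranking idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's information projection procedure: it assumes that the gain of a patch can be approximated by the Kullback-Leibler divergence between its response histogram on positive examples and on negative examples, and it ranks patches independently rather than pursuing them one at a time. The patch names, histograms, and the `min_gain` threshold are hypothetical.

```python
import math

def normalize(counts):
    """Turn raw histogram counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def rank_patches(pos_hists, neg_hists, min_gain=0.05):
    """Rank candidate patches by an assumed gain, KL(positive || negative).

    Patches whose gain falls below min_gain (a stand-in for "within
    statistical fluctuation") are dropped.
    """
    gains = []
    for name in pos_hists:
        p = normalize(pos_hists[name])
        q = normalize(neg_hists[name])
        gains.append((name, kl_divergence(p, q)))
    gains.sort(key=lambda item: item[1], reverse=True)
    return [(name, gain) for name, gain in gains if gain >= min_gain]

# Hypothetical 4-bin feature response histograms for three candidate patches.
pos = {"sketch_a": [1, 1, 2, 16], "texture_b": [5, 5, 5, 5], "flat_c": [12, 4, 2, 2]}
neg = {"sketch_a": [5, 5, 5, 5], "texture_b": [5, 5, 5, 5], "flat_c": [6, 6, 4, 4]}

selected = rank_patches(pos, neg)
for name, gain in selected:
    print(f"{name}: gain = {gain:.3f} nats")
```

In this toy input, `texture_b` responds identically on positive and negative examples, so its assumed gain is zero and it is dropped; the other two patches are kept in order of decreasing gain.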

“…First, as illustrated by Figure 19, as we zoom out the images, the image patterns undergo a transition from low-entropy regime of geometric structures to mid-entropy regime of object shapes to high-entropy regime of stochastic textures [36] (patterns such as the brick wall are also textures, but they […] for the distribution of the nominal template (before shape deformation), where λ_{x,s,α} can be either positive (for sketch patterns) or negative (for flatness patterns). One can sparsify λ_{x,s,α} by ℓ1 penalized maximum likelihood.…”

Section: Contributions and Limitations

mentioning

confidence: 99%

“…Information theoretical interpretation. The exponential family model can be justified by the maximum entropy principle [16,28,36,43]. Given the deformed template…”

mentioning

confidence: 99%
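For reference, the standard maximum entropy argument that the statement above alludes to can be written as follows. The notation is generic and assumed here, not taken from the citing paper: $q$ is a reference distribution, $H_k$ are feature statistics, and $h_k$ are their observed values.

```latex
\begin{aligned}
&\max_{p}\; -\int p(I)\,\log\frac{p(I)}{q(I)}\,dI
\quad\text{subject to}\quad
\mathrm{E}_p\!\left[H_k(I)\right] = h_k,\; k = 1,\dots,K,
\qquad \int p(I)\,dI = 1,\\[4pt]
&\Longrightarrow\quad
p(I;\lambda) = \frac{1}{Z(\lambda)}\,q(I)\,
\exp\Big\{\sum_{k=1}^{K}\lambda_k H_k(I)\Big\},
\qquad
Z(\lambda) = \int q(I)\,\exp\Big\{\sum_{k=1}^{K}\lambda_k H_k(I)\Big\}\,dI .
\end{aligned}
```

The multipliers $\lambda_k$ are chosen so that the constraints hold, which is why the resulting model is an exponential family in the statistics $H_k$.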

Unsupervised learning of compositional sparse code for natural image representation

Hong et al.

2013

Quart. Appl. Math.

Self Cite

Abstract. This article proposes an unsupervised method for learning compositional sparse code for representing natural images. Our method is built upon the original sparse coding framework where there is a dictionary of basis functions often in the form of localized, elongated and oriented wavelets, so that each image can be represented by a linear combination of a small number of basis functions automatically selected from the dictionary. In our compositional sparse code, the representational units are composite: they are compositional patterns formed by the basis functions. These compositional patterns can be viewed as shape templates. We propose an unsupervised learning method for learning a dictionary of frequently occurring templates from training images, so that each training image can be represented by a small number of templates automatically selected from the learned dictionary. The compositional sparse code approximates the raw image of a large number of pixel intensities using a small number of templates, thus facilitating the signal-to-symbol transition and allowing a symbolic description of the image. The current form of our model consists of two layers of representational units (basis functions and shape templates). It is possible to extend it to multiple layers of hierarchy. Experiments show that our method is capable of learning meaningful compositional sparse code, and the learned templates are useful for image classification.
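The sparse coding step described in the abstract above, representing a signal as a linear combination of a few dictionary elements, can be sketched with plain matching pursuit. This is a generic stand-in, not the article's learning algorithm: the abstract does not say which selection procedure is used, and the tiny dictionary and signal below are hypothetical. The atoms are assumed to have unit norm.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse code of `signal` over unit-norm atoms in `dictionary`.

    Returns (coeffs, residual), where coeffs maps atom index to coefficient.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_atoms):
        # Select the atom most correlated with the current residual.
        best = max(range(len(dictionary)),
                   key=lambda k: abs(dot(residual, dictionary[k])))
        c = dot(residual, dictionary[best])
        if abs(c) < 1e-12:
            break  # nothing left to explain
        coeffs[best] = coeffs.get(best, 0.0) + c
        # Remove the selected atom's contribution from the residual.
        residual = [r - c * d for r, d in zip(residual, dictionary[best])]
    return coeffs, residual

s = 1.0 / math.sqrt(2.0)
# Hypothetical overcomplete dictionary of five unit-norm atoms in R^4.
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, s, s],
    [0.0, 0.0, s, -s],
    [0.5, 0.5, 0.5, 0.5],
]
# Signal built as 3 * atoms[0] + 2 * atoms[2].
signal = [3.0, 0.0, 2.0 * s, 2.0 * s]

coeffs, residual = matching_pursuit(signal, atoms, n_atoms=2)
print(coeffs)
print(max(abs(r) for r in residual))
```

With two greedy steps the sketch selects the two atoms used to build the signal, leaving a residual that is zero up to floating-point error. With an overcomplete dictionary this exact recovery is not guaranteed in general; it holds here because of how the toy signal was constructed.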
