Summary. Although linear principal component analysis (PCA) originates from the work of Sylvester [67] and Pearson [51], the development of nonlinear counterparts has only received attention since the 1980s. Work on nonlinear PCA, or NLPCA, can be divided into the utilization of autoassociative neural networks, principal curves and manifolds, kernel approaches, or combinations of these approaches. This article reviews existing algorithmic work, shows how a given data set can be examined to determine whether a conceptually more demanding NLPCA model is required, and surveys developments of NLPCA algorithms. Finally, the paper outlines problem areas and challenges that require future work for the NLPCA research field to mature.
Introduction

PCA is a data analysis technique that relies on a simple transformation of a recorded observation, stored in a vector z ∈ R^N, to produce uncorrelated score variables, stored in t ∈ R^n, n ≤ N:

t = P^T z .    (1.1)

Here, P is a transformation matrix constructed from orthonormal column vectors. Since the first applications of PCA [21], this technique has found its way into a wide range of different application areas, for example signal processing [75], factor analysis [29,44], system identification [77], chemometrics [20,66] and, more recently, general data mining [11,70,58] including image processing [17,72] and pattern recognition [47,10], as well as process monitoring and quality control [1,82] including multiway [48], multiblock [52] and multiscale [3] extensions. This success is mainly related to the ability of PCA to describe most of the significant information/variation within the recorded data by the first few score variables, which simplifies subsequent data analysis accordingly.

Sylvester [67] formulated the idea behind PCA in his work on the removal of redundancy in bilinear quantics, polynomial expressions in which the sum of the exponents is of an order greater than 2, and Pearson [51] laid the conceptual basis for PCA by defining lines and planes in a multivariable space that give the closest fit to a given set of points. Hotelling [28] then refined this formulation to that used today. Numerically, PCA is closely related to an eigenvector-eigenvalue decomposition of a data covariance or correlation matrix; numerical algorithms to obtain this decomposition include the iterative NIPALS algorithm [78], which had been formulated in similar terms by Fisher and MacKenzie earlier on [80], and the singular value decomposition. Good overviews concerning PCA are given in Mardia et al. [45], Jolliffe [32], Wold et al. [80] and Jackson [30].

The aim of this article is to review and examine nonlinear extensions of PCA that have been proposed over the past two decades. This is an important research field, as the application of linear PCA to nonlinear data may be inadequate [49]. The first attempts to present nonlinear PCA extensions include a generalization, utilizing nonmetric scaling, that produces a nonlinear optimization problem [42] and the construction of curves...
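To make the linear transformation in (1.1) concrete, the following minimal sketch computes score variables from the eigendecomposition of a sample covariance matrix. It assumes NumPy is available, and the names (pca_scores, Z, T, P) are illustrative rather than taken from the cited literature.

# Minimal sketch of the PCA transformation t = P^T z in equation (1.1),
# assuming mean-centred analysis of a data matrix Z (rows = observations).
import numpy as np

def pca_scores(Z, n_components):
    """Project N-dimensional observations (rows of Z) onto the first
    n_components principal directions, i.e. t = P^T z per observation."""
    Zc = Z - Z.mean(axis=0)                # mean-centre the data
    C = np.cov(Zc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigendecomposition (ascending order)
    order = np.argsort(eigvals)[::-1]      # reorder eigenvalues descending
    P = eigvecs[:, order[:n_components]]   # orthonormal loading vectors
    T = Zc @ P                             # score variables, one row per observation
    return T, P, eigvals[order]

# Illustrative usage: 200 observations of a 5-dimensional variable, 2 scores retained.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5))
T, P, variances = pca_scores(Z, n_components=2)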
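Similarly, the iterative NIPALS procedure mentioned above extracts one principal component at a time by alternating regressions and deflating the data matrix. The sketch below is a hedged illustration of that idea under the same NumPy assumption; it is not a reproduction of the algorithm as published in [78].

# Hedged sketch of a NIPALS-style iteration, assuming a data matrix Z
# with rows as observations; names and tolerances are illustrative.
import numpy as np

def nipals(Z, n_components, tol=1e-10, max_iter=500):
    """Extract principal components one at a time by alternating regressions."""
    Z = Z - Z.mean(axis=0)                           # mean-centre the data
    scores, loadings = [], []
    for _ in range(n_components):
        t = Z[:, [np.argmax(Z.var(axis=0))]]         # initialise score with the highest-variance column
        for _ in range(max_iter):
            p = Z.T @ t / (t.T @ t)                  # regress columns of Z on the score vector
            p /= np.linalg.norm(p)                   # normalise the loading vector
            t_new = Z @ p                            # update the score vector
            converged = np.linalg.norm(t_new - t) < tol
            t = t_new
            if converged:
                break
        Z = Z - t @ p.T                              # deflate: remove explained variation
        scores.append(t)
        loadings.append(p)
    return np.hstack(scores), np.hstack(loadings)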