protein particles using self-organizing maps. J. Lipid Res. 2010. 51: 431-439.
Supplementary key words metabolism • lipids • ultracentrifugation • subfractions • unsupervised data analysis
Lipoprotein metabolism plays a key role in health and disease. Related measures, such as HDL and LDL cholesterol, are in common use to describe individuals' overall metabolic status and the potential risk for atherosclerosis and vascular complications. However, lipoprotein metabolism appears significantly more complex than just an interplay between HDL and LDL. Thus, information on lipoprotein subpopulations is presently needed to appreciate the various counteracting metabolic phenomena and also to more accurately assess the risk for various vascular outcomes (1-3).
It is well established that the liver plays a central role in the apolipoprotein B (apoB) particle cascade, i.e., the endogenous transport system of lipids to various tissues, by producing and secreting VLDL particles into the circulation (4-6).
This new SOM approach was applied here to analyze and interpret the individual multivariate lipoprotein lipid data.
Logic
A characteristic feature of SOMs is their ability to map nonlinear relations in multidimensional data sets onto visually more approachable, typically two-dimensional, planes of nodes. The overall concept of SOM analysis is illustrated in Fig. 1. The input data to the SOM from each case i, i.e., from each plasma sample in this particular application, contain a number of variables used to form a vector. The SOM algorithm (21, 26) then transforms the input data vectors into a two-dimensional map in which each node (j, k) (j indexes the rows and k the columns, for a total of J rows and K columns) is represented by a single feature vector representing the original N-dimensional space, i.e., the input data. After the self-organizing process, the point density of the feature vectors roughly follows the probability density of the data, thereby making the SOM a valuable tool for detecting similarities and groupings in a data set. The training algorithm is rather simple (and also robust to missing values), and the resulting maps are easy to visualize. The feature vectors of neighboring nodes in the two-dimensional map are similar to each other and thereby, importantly, the individuals ending up in nearby nodes are also similar in the original N-dimensional space (21, 24, 27).
The visualization phase of the SOM analysis is two-fold: first, to look at potential constellations of nodes (feature vectors) that would describe similar individuals (groups) in the original variable space; second, to depict input (or other related) variables over the two-dimensional map in order to obtain a quick overview of their distribution and values in different nodes, i.e., for each feature vector. In other words, each node describes a model individual, which, in turn, bears a link to the individuals specified in the original N-dimensional space. The SOM algorithm, thus, offers the possibility t...
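The mapping described in this section, from N-dimensional input vectors to a J x K grid of feature vectors, can be illustrated with a minimal sketch. It is not the implementation used in this work. The function names `train_som` and `best_matching_node`, the Gaussian neighborhood, the linearly decaying learning rate and radius, the random initialization, and the assumption that inputs are scaled to roughly the range 0 to 1 are all illustrative choices. Missing values are represented here as `None` and simply skipped in the distance and update steps; the text states only that training is robust to missing values, not how this is achieved.

```python
import math
import random


def best_matching_node(weights, x):
    """Return (j, k) of the node whose feature vector is closest to x."""
    best, best_dist = None, float("inf")
    for j, row in enumerate(weights):
        for k, w in enumerate(row):
            # Squared Euclidean distance over observed components only
            # (assumption: missing values are encoded as None).
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x) if xi is not None)
            if dist < best_dist:
                best, best_dist = (j, k), dist
    return best


def train_som(data, J, K, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a J x K map on a list of N-dimensional input vectors."""
    rng = random.Random(seed)
    n_dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(J, K) / 2.0
    # Random initial feature vectors in [0, 1) (assumes inputs on a similar scale).
    weights = [[[rng.random() for _ in range(n_dim)] for _ in range(K)]
               for _ in range(J)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking radius, floored at 0.5
        x = rng.choice(data)                 # present one randomly chosen case
        bj, bk = best_matching_node(weights, x)
        for j in range(J):
            for k in range(K):
                # Gaussian neighborhood measured on the two-dimensional grid.
                grid_dist2 = (j - bj) ** 2 + (k - bk) ** 2
                h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                w = weights[j][k]
                for d, xd in enumerate(x):
                    if xd is not None:
                        # Pull the feature vector toward the input vector.
                        w[d] += lr * h * (xd - w[d])
    return weights
```

After training, calling `best_matching_node(weights, x)` for each case assigns that individual to a node. Because neighboring nodes are updated together, individuals assigned to nearby nodes tend to be similar in the original variable space, which is the property the visualization phase relies on.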