This paper proposes a structural classification based correlation and applies it to principal component analysis (PCA) for high-dimension, low-sample-size (HDLSS) data. The structural classification based correlation consists of two kinds of correlation: the correlation of objects over variables and the correlation of classification structures of objects over clusters. This correlation can therefore measure not only the similarity of objects but also the similarity of classification structures. We apply this correlation to PCA for HDLSS data, in which the number of variables is much larger than the number of objects. It is known that the eigenvalues of the covariance matrix of variables cannot be obtained correctly for HDLSS data. Because the result of ordinary PCA is based on these eigenvalues, applying ordinary PCA to HDLSS data does not give a correct result. To solve this problem, we use the proposed structural classification based correlation with respect to variables. Since this correlation includes the correlation of classification structures, we can obtain a similarity relationship of objects in a lower-dimensional space spanned by the obtained principal components. Several numerical examples show the effectiveness of the proposed PCA using the structural classification based correlation for HDLSS data.
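One well-known aspect of the HDLSS difficulty with ordinary PCA can be illustrated numerically. The sketch below is not the proposed method; it only shows that, for n objects and p variables with p much larger than n, the sample covariance matrix of the variables has rank at most n - 1, so all but n - 1 of its p eigenvalues are zero. The function name `covariance_spectrum`, the synthetic Gaussian data, and the sizes used are assumptions made purely for illustration.

```python
import numpy as np


def covariance_spectrum(n_objects, n_variables, seed=0):
    """Return the rank of the centered data matrix and the leading
    eigenvalues of the sample covariance matrix of the variables.

    Hypothetical helper for illustration; the data are synthetic Gaussian
    noise, not data from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_objects, n_variables))

    # Center each variable (column) over the objects (rows).
    Xc = X - X.mean(axis=0)

    # The nonzero eigenvalues of the p x p covariance matrix equal the
    # squared singular values of Xc divided by (n - 1). Using the SVD
    # avoids forming the p x p matrix explicitly.
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    eigenvalues = singular_values**2 / (n_objects - 1)

    # Centering removes one degree of freedom, so the rank is at most n - 1.
    rank = int(np.linalg.matrix_rank(Xc))
    return rank, eigenvalues


rank, eigenvalues = covariance_spectrum(n_objects=10, n_variables=500)
print("rank of centered data:", rank)
print("smallest returned eigenvalue:", eigenvalues[-1])
```

With 10 objects and 500 variables, at most 9 eigenvalues of the 500 x 500 covariance matrix can be nonzero, which is one reason its spectrum is a poor basis for estimating the covariance structure of the variables in this setting.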