Classification methods are fundamental techniques for building mathematical models that assign each object to its proper class on the basis of a set of measurements. When the number of variables in an experiment is large, classifying objects into groups becomes prone to misclassification. This study explores two parametric approaches to classification with a large number of independent variables: partial least squares discriminant analysis (PLS-DA) and principal component analysis followed by linear discriminant analysis (PCA+LDA). Data were generated with a data simulator, Azure Machine Learning (AML) Studio, through a custom R module. The performance of PLS-DA was analysed and compared with that of the PCA+LDA model for different numbers of variables (p) and different sample sizes (n), with both methods evaluated on the basis of minimum misclassification rate. The results showed that PLS-DA performed better than PCA+LDA for large sample sizes. PLS-DA can therefore be considered a good and reliable technique for classification tasks on large datasets.