2016
DOI: 10.2174/2213235x04666160613122429

PCA as a Practical Indicator of OPLS-DA Model Reliability

Abstract: Background: Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OP…

Cited by 339 publications
(238 citation statements)
References 22 publications
“…Another study discussed the fact that the now-conventionally used statistical methods such as PLS-DA and OPLS-DA can mislead in the absence of proper validation and provides practical guidelines and cross-validation recommendations for reliable inference from PCA and OPLS-DA models. 65 …”
Section: Discussion (mentioning)
confidence: 99%
“…Since the number of replicates for a metabolomics data set is typically far smaller than the number of variables, overfitting the data, especially for supervised techniques like PLS or OPLS, is a serious concern [65, 66, 87]. In fact, a PLS/OPLS model can produce the appearance of a clear group separation even for noise or completely random data [88].…”
Section: Brief Overview Of Metabolomics (mentioning)
confidence: 99%
“…While chemometric techniques regularly streamline the process of data analysis, these advanced multivariate statistical techniques are routinely used incorrectly, lack proper validation, and have, unfortunately, led to a proliferation of erroneous data in the scientific literature [65, 66]. This problem becomes compounded when an investigator is combining multiple analytical sources.…”
Section: Introduction (mentioning)
confidence: 99%
“…All data sets were scaled to unit variance, allowing all metabolites to become equally important. PCA describes the total variability within the data set and can be used as an informative indicator, while PLS‐DA is used to predict the spectral features (metabolites) that define separation between groups (phenotypes). The overall quality of the models was judged by cumulative R² (goodness of fit) and cumulative Q² (goodness of prediction).…”
Section: Methods (mentioning)
confidence: 99%
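
The citation statements above make two related points: a PLS-type model can appear to separate groups even in random data, and model quality is commonly judged by R² (goodness of fit) and Q² (goodness of prediction). The following is a minimal sketch of that contrast, not the method of the paper or of any citing study. It rests on several assumptions made here for illustration: a one-component PLS regression stands in for OPLS-DA, Q² is computed as 1 - PRESS/TSS with leave-one-out cross-validation, and the data are pure Gaussian noise with 20 samples, 200 variables, and arbitrary 0/1 class labels. All function names are hypothetical.

```python
import random


def column_means(rows):
    """Mean of each column of a list-of-lists matrix."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_pls1(X, y):
    """Fit a one-component PLS regression (illustrative stand-in for OPLS-DA)."""
    x_means = column_means(X)
    y_mean = sum(y) / len(y)
    Xc = [[v - m for v, m in zip(row, x_means)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance with y, i.e. X^T y, normalized.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(x_means))]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores and the regression coefficient of y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, w, b


def predict(model, X):
    """Predict y for new rows using the training means, weights and coefficient."""
    x_means, y_mean, w, b = model
    return [
        y_mean + b * sum((v - m) * wj for v, m, wj in zip(row, x_means, w))
        for row in X
    ]


def r2_q2(X, y):
    """Return (R2, Q2): fit on all data vs. leave-one-out prediction."""
    y_mean = sum(y) / len(y)
    tss = sum((v - y_mean) ** 2 for v in y)
    fitted = predict(fit_pls1(X, y), X)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    press = 0.0
    for i in range(len(y)):
        # Refit without sample i, then predict the held-out sample.
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(model, [X[i]])[0]) ** 2
    return 1 - rss / tss, 1 - press / tss


# Pure noise: the class labels carry no information about X.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(20)]
y = [0.0] * 10 + [1.0] * 10

r2, q2 = r2_q2(X, y)
print(f"R2 = {r2:.2f}, Q2 = {q2:.2f}")
```

With far more variables than samples, the fitted R² on such noise is expected to be high while the cross-validated Q² is expected to stay low, which is why a large gap between the two is usually read as a sign of overfitting rather than of real group separation.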