2016
DOI: 10.1021/acs.jproteome.5b00981

Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies

Abstract: Missing values are a genuine issue in label-free quantitative proteomics. Recent works have surveyed the different statistical methods to conduct imputation and have compared them on real or simulated data sets and recommended a list of missing value imputation methods for proteomics application. Although insightful, these comparisons do not account for two important facts: (i) depending on the proteomics data set, the missingness mechanism may be of different natures and (ii) each imputation method is devoted…

citations
Cited by 379 publications
(485 citation statements)
references
References 35 publications
“…To overcome the problem of missing values without imputation or complex statistical analysis (Koopmans et al , 2014; Lazar et al , 2016; Wang et al , 2017), we applied a hybrid approach for data analysis that treats intensity-based and presence/absence data separately (Fig. 1; Supplementary Fig.…”
Section: Results (mentioning)
confidence: 99%
“…Inferno by Taverner et al (2012) allows for K-Nearest Neighbors (KNN) imputation (Troyanskaya et al, 2001) and in previous versions allowed for imputation from a mixture model proposed by Karpievitch et al (2009). Many more single imputation methods have been evaluated in review papers by Lazar et al (2016) and Webb-Robertson et al (2015), including simple imputations of column means and column minimums as well as an imputation based on the singular value decomposition originally proposed for microarray data (Owen and Perry, 2009). …”
Section: Methods (mentioning)
confidence: 99%
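
The "simple imputations of column means and column minimums" named in the statement above can be illustrated with a minimal sketch. It assumes the data are a numeric matrix with missing values coded as NaN; the function name `impute_columns` and its `strategy` parameter are hypothetical, chosen for illustration, and are not taken from any of the cited papers or tools.

```python
import numpy as np


def impute_columns(x, strategy="mean"):
    """Fill NaNs in each column with that column's observed mean or minimum.

    Hypothetical helper for illustration only. Assumes every column has at
    least one observed value; an all-NaN column would stay NaN.
    """
    x = np.asarray(x, dtype=float)
    if strategy == "mean":
        fill = np.nanmean(x, axis=0)  # per-column mean, ignoring NaNs
    elif strategy == "min":
        fill = np.nanmin(x, axis=0)   # per-column minimum, ignoring NaNs
    else:
        raise ValueError("strategy must be 'mean' or 'min'")
    # Broadcast the per-column fill values into the missing positions only.
    return np.where(np.isnan(x), fill, x)


if __name__ == "__main__":
    data = [[1.0, np.nan],
            [3.0, 4.0],
            [np.nan, 8.0]]
    print(impute_columns(data, "mean"))
    print(impute_columns(data, "min"))
```

Both strategies replace every missing entry in a column with a single value, so they shrink the variance of that column; this is one reason such methods are usually treated as baselines rather than recommendations.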
“…Lazar et al (2016) reported that missing not at random (MNAR) imputation methods were problematic since the range of imputed values is not representative of the true range of missing values. These MNAR imputations along with the AFT model assume that every missing value falls below or at some lower limit of detection.…”
Section: Methods (mentioning)
confidence: 99%
“…Various external databases, annotation sources, and multiple omics types can be loaded and matched within the software and together with powerful enrichment techniques allow for smooth data integration cleansing is usually performed which includes normalization, to ensure that different samples are comparable, and missing value handling to enable the use of methods that require all data points to be present. A plethora of imputation methods developed for microarray data [13] can be applied to proteomics as well [14]. Among these, methods with the underlying assumption that missing values result from protein expression that lies under the detection limit of modern mass spectrometers are frequently used.…”
Section: Introduction (mentioning)
confidence: 99%