2021
DOI: 10.3390/metabo11090631

A New Pipeline for the Normalization and Pooling of Metabolomics Data

Abstract: Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Specifically, different studies may use variable sample types (e.g., serum versus plasma) collected, treated, and stored according to different protocols, and assayed in different laboratories using different instrume…

citations
Cited by 17 publications
(23 citation statements)
references
References 37 publications
“…Another limitation is that the large sample size was achieved by pooling data from different previous studies, rather than by initial design, therefore adding methodological complexity because of analyses performed by different laboratories, with different instruments, and on different biological matrices. However, the analytical protocol used has shown high inter-laboratory reproducibility [60], and we addressed potential heterogeneity in metabolite concentrations by developing a dedicated pipeline [24] applied to the data prior to statistical analyses. In addition, for all metabolites included (except asparagine, not evaluated), high correlations were reported between measures in serum and in plasma (r ≥ 0.78, except for arginine, r = 0.50), although concentrations were generally higher in serum than in plasma, in particular for arginine [61].…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…A specific statistical pipeline was developed [24] and applied on raw metabolite concentrations (before exclusion of hormone users) to adequately pool measures obtained from different studies, instruments, and laboratories. This pipeline was shown to be efficient in removing unwanted variability and improving the comparability of measurements acquired across different nested studies.…”
Section: Statistical Analyses Normalization Of Metabolite Concentrations | Citation type: mentioning
confidence: 99%
“…Going forward, metaboprep will be developed to address evolving needs, starting with additional functionality to enable direct read-in of the new (since 2021) format datafiles supplied by Metabolon. Perhaps unsurprisingly, given the rapid increase in use of metabolomics data in epidemiology, parallel efforts are being made to improve analytical efficiency, such as the recent release of the R package maplet (Metabolomics Analysis PipeLinE Toolbox; Chetnik et al., 2022), and to construct pipelines for combining metabolomic datasets across cohorts (Viallon et al., 2021); any future developments of metaboprep will necessarily be made within this context. Our package does not provide any tools for statistical analysis or downstream interpretation, and therefore, we anticipate that metaboprep will be used in conjunction with complementary tools such as MetaboAnalyst (Pang et al., 2021), which provides a broader set of functions to aid raw MS spectra processing as well as post-analytical biomarker analysis.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…Moreover, following the rationale of the lasso-OLS hybrid [41], associations identified by the data shared lasso were further inspected using unpenalized conditional logistic regression models, (i) to quantify their strength and investigate possible heterogeneity among the type-specific associations beyond those identified by the data shared lasso (see Section 3 in Additional file 1 for details); (ii) to assess possible departure from linearity by comparing models with natural cubic splines to models with linear terms only; and (iii) to assess possible attenuation after excluding, in turn, first two and first seven years of follow-up (to examine potential reverse causation and more generally assess the impact of time to diagnosis on our findings), and after adjustment for additional factors (education level, waist circumference, height, physical activity, smoking status, alcohol intake, use of non-steroidal anti-inflammatory drugs, and, for women, menopausal status and phase of menstrual cycle in premenopausal women). Finally, effect modification by BMI was assessed under standard (i.e., non-conditional) logistic regression models after breaking the matching and correcting metabolite measurements for batch and study effects [30].…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…Selection of the metabolites, data pre-processing. Data were pre-processed following an established procedure [30]. Briefly, metabolites with more than 25% missing values in any study were excluded.…”
Section: Laboratory Analysis As Summarized In Table | Citation type: mentioning
confidence: 99%
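
The missing-value rule quoted in the last statement (exclude metabolites with more than 25% missing values in any study) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the cited procedure: the function name `metabolites_to_keep`, the dictionary layout, and the toy values are hypothetical, and the quoted sentence does not say how missing values are encoded (here `None` or NaN) or whether a metabolite absent from a sample counts as missing (here it does).

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumption about the encoding)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def metabolites_to_keep(studies, max_missing=0.25):
    """Return, sorted, the metabolites whose missing fraction does not
    exceed max_missing in every study.

    studies: dict mapping a study name to a list of samples, where each
    sample is a dict mapping metabolite name to a measured value.
    """
    # Collect every metabolite name seen in any sample of any study.
    metabolites = set()
    for samples in studies.values():
        for sample in samples:
            metabolites.update(sample)

    kept = []
    for name in sorted(metabolites):
        exceeds = False
        for samples in studies.values():
            if not samples:
                continue  # an empty study cannot exceed the threshold
            n_missing = sum(is_missing(s.get(name)) for s in samples)
            if n_missing / len(samples) > max_missing:
                exceeds = True  # too many missing values in this study
                break
        if not exceeds:
            kept.append(name)
    return kept


# Toy data, invented for illustration only.
studies = {
    "study_A": [
        {"glycine": 1.0, "arginine": None},
        {"glycine": 1.2, "arginine": 0.8},
        {"glycine": 0.9, "arginine": 0.7},
        {"glycine": 1.1, "arginine": 0.9},
    ],
    "study_B": [
        {"glycine": 1.3, "arginine": None},
        {"glycine": None, "arginine": None},
        {"glycine": 1.0, "arginine": 0.6},
        {"glycine": 1.1, "arginine": 0.5},
    ],
}

kept = metabolites_to_keep(studies)
print(kept)
```

In the toy data, arginine is missing in half of the study_B samples, so it is dropped even though it passes the threshold in study_A; the rule is applied per study rather than on the pooled data.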