2016
DOI: 10.1007/s11306-016-1026-5

Normalization and integration of large-scale metabolomics data using support vector regression

Abstract: Introduction Untargeted metabolomics studies for biomarker discovery often have hundreds to thousands of human samples. Data acquisition of large-scale samples has to be divided into several batches and may span from months to as long as several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies. Objectives We aim to develop a data normalization method to reduce unwanted variations and integ…
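As a concrete illustration of the approach the abstract describes, below is a minimal Python sketch of QC-based SVR drift correction. It is a simplified variant that models drift as a function of injection order alone (the published method, released as the R package MetNormalizer, additionally uses intensities of correlated metabolite peaks as predictors); the function name and SVR hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of QC-based SVR drift correction for one metabolite
# feature. Simplified relative to the paper: drift is modeled from injection
# order only, and hyperparameters are untuned defaults.
import numpy as np
from sklearn.svm import SVR

def svr_normalize(intensities, injection_order, qc_mask):
    """intensities     : (n_samples,) raw peak intensities across the run
       injection_order : (n_samples,) injection index of each sample
       qc_mask         : (n_samples,) True where the sample is a pooled QC
    """
    # Fit the drift curve on pooled-QC injections only.
    model = SVR(kernel="rbf", C=1.0, epsilon=0.1)
    model.fit(injection_order[qc_mask].reshape(-1, 1), intensities[qc_mask])

    # Predict the drift at every injection, divide it out, and rescale to
    # the median QC intensity so corrected values stay in a familiar range.
    drift = model.predict(injection_order.reshape(-1, 1))
    return intensities / drift * np.median(intensities[qc_mask])
```

In practice this would be applied feature by feature across the peak table, with the QC samples injected at regular intervals throughout each batch.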



Cited by 143 publications (121 citation statements) · References 42 publications
“…20% loss in features. On the other hand, during long runs the signal intensity of metabolites drifts due to MS instrument contamination, matrix effects, or LC column degradation. Consequently, a long method significantly limits the number of samples that can be analyzed in a single batch.…”
Section: Results
Confidence: 99%
“…Consequently, a long method significantly limits the number of samples that can be analyzed in a single batch. Although different strategies for integrating several analytical batches of fingerprinting data have been proposed, signal variation between batches makes data integration challenging, and great care in experimental design, data acquisition, quality control, and subsequent data analysis is required. Therefore, the concept of the present study was to develop a methodology for lung tissue fingerprinting with a single analytical platform (LC-QTOF-MS) that can detect metabolites from several classes and at the same time is short enough to cover a large number of samples in a single batch.…”
Section: Results
Confidence: 99%
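The inter-batch integration strategies this excerpt alludes to vary widely. As one generic baseline (an illustration only, not the specific method of any work cited here), each batch can be rescaled so that its pooled-QC median matches a global reference:

```python
# Minimal sketch of a simple inter-batch integration baseline: scale each
# batch so its pooled-QC median matches the QC median across all batches.
# Assumes every batch contains at least one pooled-QC injection.
import numpy as np

def qc_median_batch_scale(intensities, batch_ids, qc_mask):
    """intensities: (n_samples,) one feature; batch_ids: (n_samples,) ints;
       qc_mask: (n_samples,) True for pooled-QC injections."""
    corrected = intensities.astype(float)
    global_ref = np.median(intensities[qc_mask])  # reference QC level
    for b in np.unique(batch_ids):
        in_batch = batch_ids == b
        batch_qc = np.median(intensities[in_batch & qc_mask])
        corrected[in_batch] *= global_ref / batch_qc
    return corrected
```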
“…For example, in the identification of novel per- and polyfluorinated compounds in human serum, HRMS data were normalized using an internal standard peak area (¹³C₄-PFOS) and total area sums (Rotander et al, 2015). The latter approach involves the use of a pooled QC sample across the batch and applies computational models to correct the abundance deviation of a molecular feature in a sample according to its performance in the neighboring QC run using a locally estimated scatterplot smoothing (LOESS) method (Shen et al, 2016b). Data scaling and transformation compares molecular features within and between samples, and is considered a between-chromatograms or column-wise correction (van den Berg et al, 2006).…”
Section: High Resolution Data Extraction Features For Scaling the…
Confidence: 99%
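The pooled-QC LOESS correction described in this excerpt can be sketched as follows. This is a minimal illustration assuming a statsmodels-based implementation; the function name and the smoothing span `frac` are chosen for exposition, not taken from the cited works.

```python
# Hedged sketch of pooled-QC LOESS drift correction for one feature:
# smooth QC intensities against injection order, then interpolate the
# fitted drift curve to every sample injection.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def loess_qc_correct(intensities, injection_order, qc_mask, frac=0.7):
    # Fit LOESS on the QC injections only; `frac` is an illustrative
    # default, in practice tuned per dataset (e.g. by cross-validation).
    fitted = lowess(intensities[qc_mask], injection_order[qc_mask],
                    frac=frac, return_sorted=True)
    # Interpolate the smoothed QC drift curve to every injection, then
    # divide it out and rescale to the median QC intensity.
    drift = np.interp(injection_order, fitted[:, 0], fitted[:, 1])
    return intensities / drift * np.median(intensities[qc_mask])
```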
“…MetNormalizer is an R package developed to reduce confounding factors and unwanted variations, such as intra- and inter-batch variability, using a machine learning algorithm-based method, support vector regression normalization.…”
Section: Data Preprocessing Tools
Confidence: 99%