False discovery rate (FDR) control has been a major challenge for large-scale metabolome annotation. Although recent research indicated that the target-decoy strategy can be used to estimate FDR, FDR control remains hard in practice because a reliable decoy database is difficult to obtain: metabolites fragment through complex mechanisms, and isomers are ubiquitous. To tackle this problem, we developed a decoy generation method that generates forged spectra from the reference target database while preserving the original reference signals, thereby simulating the presence of metabolite isomers. Benchmarks on the GNPS data sets in Passatutto showed that the FDR estimated with the decoy database generated by our method is closer to the actual FDR than that of other methods, especially in the low FDR range (0–0.05). Large-scale metabolite annotation on 35 data sets showed that strict FDR control reduced the number of annotated metabolites but increased the spectral efficiency, indicating the necessity of quality control. We recommend setting the FDR threshold to 0.01 in large-scale metabolite annotation. We implemented decoy generation, database search, and FDR control in a search engine called XY-Meta, which facilitates large-scale metabolome annotation applications.
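To make the target-decoy idea concrete, the following is a minimal sketch of how FDR can be estimated from a ranked list of spectral matches and how a 0.01 threshold would then be applied. It is not XY-Meta's implementation: the function names, the `(score, is_decoy)` input layout, and the simple decoys/targets estimator are assumptions chosen for illustration, and decoy generation itself is not shown.

```python
from typing import List, Tuple

Hit = Tuple[float, bool]  # (match score, True if the hit came from the decoy database)


def estimate_q_values(hits: List[Hit]) -> List[Tuple[float, bool, float]]:
    """Rank hits by descending score and attach a q-value to each.

    Assumed estimator: FDR at a score cutoff = decoys above cutoff / targets above cutoff.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # max(..., 1) avoids division by zero when the top hits are all decoys
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this cutoff or any more permissive one,
    # which makes the values monotone along the ranking
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, q_values)]


def accepted_target_scores(hits: List[Hit], threshold: float = 0.01) -> List[float]:
    """Scores of target hits whose q-value does not exceed the threshold."""
    return [s for s, d, q in estimate_q_values(hits) if not d and q <= threshold]


# Hypothetical toy data: three target hits and two decoy hits
example_hits = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
strict = accepted_target_scores(example_hits, threshold=0.01)
loose = accepted_target_scores(example_hits, threshold=0.4)
```

With the toy data, the strict threshold keeps only the two hits ranked above the first decoy, while the looser threshold also admits the third target. This mirrors the trade-off described above between the number of annotations and their reliability.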
Batch inconsistency is a major problem when applying LC-MS based untargeted metabolomics in real-time analysis situations such as clinical diagnosis or health monitoring. In addition, inefficient MS2 collection is a major problem for metabolite identification. Here, we developed a reference-feature based quantification and identification strategy (RFQI). In RFQI, samples are individually profiled using a pre-fixed reference feature table. Quantification results show that RFQI significantly improves the overlap rate of features and reduces variance across batches in real-time-analysis mode, and can find more than four times as many features. In addition, RFQI collects MS2 from the continually growing set of samples for metabolite identification of the pre-fixed features, and thus effectively compensates for the poor efficiency of MS2 collection in data-dependent acquisition mode. In summary, RFQI takes full advantage of the continually growing set of samples in real-time analysis situations, for both quantification and identification.
Keywords: batch effect, LC-MS based untargeted metabolomics, metabolite identification
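As an illustration of profiling a sample against a pre-fixed reference feature table, the sketch below looks up each reference feature in one sample's peak list. The abstract does not describe RFQI's actual matching procedure, so everything here is an assumption for illustration: the function name, the `(m/z, retention time, intensity)` peak layout, the tolerance defaults, and the rule of picking the candidate closest in retention time.

```python
from typing import List, Optional, Tuple

Peak = Tuple[float, float, float]  # (m/z, retention time in seconds, intensity)
RefFeature = Tuple[float, float]   # (m/z, retention time in seconds)


def quantify_against_reference(
    reference: List[RefFeature],
    peaks: List[Peak],
    mz_tol_ppm: float = 10.0,  # assumed tolerance, not taken from the source
    rt_tol_s: float = 15.0,    # assumed tolerance, not taken from the source
) -> List[Optional[float]]:
    """Return one intensity per reference feature, or None if no peak matches.

    Because the output is indexed by the fixed reference table, every sample
    yields the same feature list regardless of the batch it was measured in.
    """
    intensities: List[Optional[float]] = []
    for ref_mz, ref_rt in reference:
        mz_tol = ref_mz * mz_tol_ppm * 1e-6
        candidates = [
            p for p in peaks
            if abs(p[0] - ref_mz) <= mz_tol and abs(p[1] - ref_rt) <= rt_tol_s
        ]
        if candidates:
            # Assumed tie-break: keep the candidate closest in retention time
            best = min(candidates, key=lambda p: abs(p[1] - ref_rt))
            intensities.append(best[2])
        else:
            intensities.append(None)
    return intensities


# Hypothetical reference table and sample peak list
reference_table = [(200.0, 60.0), (300.0, 120.0)]
sample_peaks = [(200.001, 62.0, 1000.0), (200.0005, 70.0, 500.0), (300.5, 120.0, 800.0)]
profile = quantify_against_reference(reference_table, sample_peaks)
```

In this toy case the first reference feature is matched to the peak nearest in retention time, and the second has no peak within the m/z tolerance, so it is reported as missing rather than dropped from the table.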