2020
DOI: 10.1021/acs.jproteome.0c00192

mzMLb: A Future-Proof Raw Mass Spectrometry Data Format Based on Standards-Compliant mzML and Optimized for Speed and Storage Requirements

Abstract: With ever-increasing amounts of data produced by mass spectrometry (MS) proteomics and metabolomics, and the sheer volume of samples now analyzed, the need for a common open format possessing both file size efficiency and faster read/write speeds has become paramount to drive the next generation of data analysis pipelines. The Proteomics Standards Initiative (PSI) has established a clear and precise XML representation for data interchange, mzML, receiving substantial uptake; nevertheless, storage and file acce…

citations
Cited by 19 publications
(14 citation statements)
references
References 22 publications
“…However, the file formats used in metabolomics do not provide a sufficient specification for this. The recent development of newer, more metadata-rich file formats [48][49][50] for metabolomics tells us there is an increasingly urgent need to standardize the knowledge representation and metadata elements in this community to achieve better interoperability and information exchange. There is a similar problem being worked on in the healthcare space [51], and a similar effort is likely needed when dealing with large, diverse datasets as in metabolomics and multi-omics.…”
Section: Discussion (mentioning)
confidence: 99%
“…The universal datafile format developed for the MSI community, imzML, allows for data export into multiple image processing software, but it has been shown to be 3–4 times slower in write speed when compared to the HDF5 format. There are also several other research groups attempting to address this issue by exploring file format optimization for MS files. However, these alternative formats have yet to gain traction with the broader community. Computational power continues to improve on basic lab computer workspaces, and cloud-based imaging processing (e.g., through Amazon Web Services) are becoming increasingly utilized for processing imaging data sets, and as such, we anticipate these will be the key to handling such complicated data over the next decade.…”
Section: Challenges and Future Perspectives (mentioning)
confidence: 99%
“…[124] For other analytical data types, no such requirement exists but universal data standards for other methods, the quality of which can be validated, would be beneficial to the whole community. FIDs from NMR spectrometers, [125] CSV files from spectrophotometers, mzML files from mass spectrometers, [126] and similar data from GC/HPLC systems should all be a minimum requirement for publication of results, which rely upon this data.…”
Section: Data Collection (mentioning)
confidence: 99%