Fast, axis-agnostic, dynamically summarized storage and retrieval for mass spectrometry data

Handy, Kyle; Rosen, Jeb; Gillan, André; Smith, Rob

doi:10.1371/journal.pone.0188059

Cited by 8 publications

(6 citation statements)

References 18 publications


“…Users can control how many points are rendered for the given view. In the event the setting is lower than the actual number of points, JS-MS selects a representative subset of points using the weighted striding algorithm described in [ 9 ]. Set view window (see Fig.…”

Section: Results

mentioning

confidence: 99%

“…Each query includes a requested limit on the number of points returned, which invokes the server's algorithm for selecting a representative subset of points, allowing for the user to view the characteristics of the data while only seeing a portion of the points in the given (m/z, RT) region. The server implements the MzTree data structure [9], which is a modified R-Tree that organizes the MS1 points in alternating sorting of m/z and RT to provide fast query response whether the data region requested is primarily across m/z, RT, or both.…”

mentioning

confidence: 99%
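The statement above describes a tree that alternates its sort axis between m/z and RT so that region queries stay fast in either dimension. A minimal sketch of that idea follows. It is not the MzTree implementation from [9]: the names `Node`, `build_tree`, and `range_query`, the binary split, and the `leaf_size` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A point is (mz, rt, intensity). This layout is an assumption for the sketch.
Point = Tuple[float, float, float]


@dataclass
class Node:
    # Bounding box of everything below this node: (mz_min, mz_max, rt_min, rt_max).
    bounds: Tuple[float, float, float, float]
    points: Optional[List[Point]] = None      # set on leaves only
    children: Optional[List["Node"]] = None   # set on internal nodes only


def _bounds(points: List[Point]) -> Tuple[float, float, float, float]:
    mzs = [p[0] for p in points]
    rts = [p[1] for p in points]
    return (min(mzs), max(mzs), min(rts), max(rts))


def build_tree(points: List[Point], leaf_size: int = 4, axis: int = 0) -> Node:
    """Split recursively, sorting by m/z (axis 0) and RT (axis 1) in turn."""
    if not points or leaf_size < 1:
        raise ValueError("need at least one point and leaf_size >= 1")
    if len(points) <= leaf_size:
        return Node(bounds=_bounds(points), points=list(points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    next_axis = 1 - axis  # alternate the sort axis at each level
    return Node(
        bounds=_bounds(ordered),
        children=[
            build_tree(ordered[:mid], leaf_size, next_axis),
            build_tree(ordered[mid:], leaf_size, next_axis),
        ],
    )


def range_query(node: Node, mz_lo: float, mz_hi: float,
                rt_lo: float, rt_hi: float) -> List[Point]:
    """Return all points inside the closed (m/z, RT) rectangle."""
    mz_min, mz_max, rt_min, rt_max = node.bounds
    # Prune subtrees whose bounding box misses the query rectangle.
    if mz_max < mz_lo or mz_min > mz_hi or rt_max < rt_lo or rt_min > rt_hi:
        return []
    if node.points is not None:
        return [p for p in node.points
                if mz_lo <= p[0] <= mz_hi and rt_lo <= p[1] <= rt_hi]
    found: List[Point] = []
    for child in node.children or []:
        found.extend(range_query(child, mz_lo, mz_hi, rt_lo, rt_hi))
    return found
```

Because the split axis alternates, a query that is narrow in m/z and one that is narrow in RT both prune subtrees within a level or two, which is the axis-agnostic behavior the statement refers to.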

“…JS-MS 2.0 also includes extra controls for the user to modify view parameters such as point threshold, logarithmic height scaling, and label precision. The point threshold is a function implemented in Java that limits the number of points rendered in a given view, selecting a representative subset of points using the weighted striding algorithm [9]. Applying a point threshold allows for faster load time and graph navigation.…”

mentioning

confidence: 99%
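A point threshold of this kind can be illustrated with a plain, evenly spaced stride over the points in view. This is a simplified stand-in, not the weighted striding algorithm of [9], whose weighting is not described in these statements; the function name `stride_subset` and its parameters are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def stride_subset(points: Sequence[T], limit: int) -> List[T]:
    """Pick at most `limit` evenly spaced items from `points`.

    Uniform stride only: the weighting used by the published
    algorithm is not reproduced here.
    """
    if limit <= 0:
        return []
    n = len(points)
    if n <= limit:
        return list(points)  # nothing to thin out
    # Integer arithmetic keeps indices in range and strictly increasing.
    return [points[(i * n) // limit] for i in range(limit)]
```

When the threshold is at least the number of points in view, every point is returned unchanged; otherwise the subset keeps the overall spread of the data while capping the rendering cost.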

“…The MzTree data structure is a modified R-Tree [9] that interleaves data partitions sorted by RT and m/z for fast queries in either dimension. The previously published version of the data structure did not include the fields required for annotation (such as isotope trace ID and isotopic envelope ID).…”

mentioning

confidence: 99%


A web-based system for creating, viewing, and editing precursor mass spectrometry ground truth data

Henning

Smith

2020

BMC Bioinformatics

Self Cite


Background: Mass spectrometry (MS) uses mass-to-charge ratios of measured particles to decode the identities and quantities of molecules in a sample. Interpretation of raw MS depends upon data processing algorithms that render it human-interpretable. Quantitative MS workflows are complex experimental chains and it is crucial to know the performance and bias of each data processing method as they impact accuracy, coverage, and statistical significance of the result. Creation of the ground truth necessary for quantitatively evaluating MS1-aware algorithms is a difficult and tedious task, and better software for creating such datasets would facilitate more extensive evaluation and improvement of MS data processing algorithms.

Results: We present JS-MS 2.0, a software suite that provides a dependency-free, browser-based, one click, cross-platform solution for creating MS1 ground truth. The software retains the first version’s capacity for loading, viewing, and navigating MS1 data in 2- and 3-D, and adds tools for capturing, editing, saving, and viewing isotopic envelope and extracted isotopic chromatogram features. The software can also be used to view and explore the results of feature finding algorithms.

Conclusions: JS-MS 2.0 enables faster creation and inspection of MS1 ground truth data. It is publicly available with an MIT license at github.com/optimusmoose/jsms.


“…Regardless, algorithms are limited to taking slices of the data along the retention time (or spectrum) axis only. In contrast, other attempts such as mzDB [4], mzRTree [5], and mzTree [6], focus on random I/O access through the use of an RTree [7] data structure. This allows the data to be accessed along both the m/z (or chromatogram) axis as well as the retention time axis, at the cost of file size or mass accuracy.…”

mentioning

confidence: 99%

Toffee – a highly efficient, lossless file format for DIA-MS

Tully

2019

Preprint


The closed nature of vendor file formats in mass spectrometry is a significant barrier to progress in developing robust bioinformatics software. In response, the community has developed the open mzML format, implemented in XML and based on controlled vocabularies [1]. Widely adopted, mzML is an important step forward; however, it suffers from two challenges that are particularly apparent as the field moves to high-throughput proteomics: a) large increase in file size, and corresponding increase in CPU time devoted to I/O, and b) a largely sequential I/O access pattern. Described here is 'toffee', an open, random I/O format backed by HDF5, with lossless compression that gives file sizes similar to the original vendor format and can be reconverted back to mzML without penalty. In addition to the file format, there are C++ and python libraries for creating and accessing the file format, along with a wrapper around OpenSWATH [2] that enables SWATH-MS data to be analyzed with standard algorithms. Using this library, the files can be accessed in the same manner as the Vendor file (or mzML) in a scan-by-scan manner; however, by accepting a degree of mass approximation (<5 parts per million) toffee enables data to be extracted as a two-dimensional slice analogous to an image, and thus amenable to deep-learning based peptide identification strategies. Documentation and examples are available at https://toffee.readthedocs.io, and all code is MIT licensed at https://bitbucket.org/cmriprocan/toffee.

There are many previous attempts at new and open formats for mass spectrometry. Some, such as mzML [1], and mz5 [3], aim to be archival formats that faithfully adopt the HUPO PSI guidelines.


mzMD: A New Storage and Retrieval System for Mass Spectrometry Data

Yang

Zhang

et al.

2021

Intelligent Computing Theories and Application

