This mini review discusses advantages, limitations, and examples of different mass spectrometry ionization sources applicable to natural product discovery workflows.
Even though raw mass spectrometry data is information rich, the vast majority of it is underutilized. The ability to interrogate these rich datasets is handicapped by the limited capability and flexibility of existing software. We introduce the Mass Spec Query Language (MassQL), which addresses these issues by enabling an expressive set of mass spectrometry patterns to be queried directly from raw data. MassQL is an open-source mass spectrometry query language for flexible and mass spectrometer manufacturer-independent mining of MS data. We envision that the flexibility, scalability, and ease of use of MassQL will empower the mass spectrometry community to take fuller advantage of their mass spectrometry data and accelerate discoveries.
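To make the idea of querying a pattern directly from raw data concrete, the following is a minimal Python sketch of one such pattern: finding MS2 scans that contain a product ion near a target m/z. It is not MassQL syntax and not the MassQL implementation; the `Spectrum` class, the function names, and the 0.01 m/z tolerance are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Spectrum:
    """Hypothetical in-memory spectrum: scan number, MS level, (m/z, intensity) peaks."""
    scan: int
    ms_level: int
    peaks: List[Tuple[float, float]]


def has_product_ion(spectrum: Spectrum, target_mz: float, tol: float = 0.01) -> bool:
    """Return True if any peak lies within +/- tol of target_mz."""
    return any(abs(mz - target_mz) <= tol for mz, _ in spectrum.peaks)


def query_ms2_by_product(spectra: List[Spectrum], target_mz: float, tol: float = 0.01) -> List[int]:
    """Return scan numbers of MS2 spectra that contain the target product ion."""
    return [
        s.scan
        for s in spectra
        if s.ms_level == 2 and has_product_ion(s, target_mz, tol)
    ]


# Toy data (assumed values, not taken from any real dataset).
example_spectra = [
    Spectrum(scan=1, ms_level=1, peaks=[(226.18, 1000.0)]),                   # MS1, ignored
    Spectrum(scan=2, ms_level=2, peaks=[(226.185, 500.0), (150.0, 200.0)]),  # match
    Spectrum(scan=3, ms_level=2, peaks=[(300.0, 800.0)]),                     # no match
]

matching_scans = query_ms2_by_product(example_spectra, target_mz=226.18)
```

A query language would let a user state this kind of pattern declaratively instead of writing the loop by hand; the sketch only shows the filtering logic such a pattern implies.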
Mutations can occur throughout the virus genome and may be beneficial, neutral or deleterious. We are interested in mutations that yield a C next to a G, producing CpG sites. CpG sites are rare in eukaryotic and viral genomes. In eukaryotes, CpG sites are thought to be rare because they are prone to mutation when methylated. In viruses, less is known about why CpG sites are rare. A previous study in HIV suggested that CpG-creating transition mutations are more costly than similar non-CpG-creating mutations. To determine whether this is the case in other viruses, we analyzed the allele frequencies of CpG-creating and non-CpG-creating mutations across various strains, subtypes, and genes of viruses using existing data obtained from GenBank, HIV Databases, and Virus Pathogen Resource. Our results suggest that CpG sites are indeed costly for most viruses. By understanding the cost of CpG sites, we can obtain further insights into the evolution and adaptation of viruses.
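The distinction between CpG-creating and non-CpG-creating transition mutations can be illustrated with a short Python sketch. It is a simplified illustration, not the analysis pipeline used in the study: the function names and the toy sequence are hypothetical, and only transitions (A<->G, C<->T) on a single strand are considered.

```python
# Transition mutations swap purine for purine or pyrimidine for pyrimidine.
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}


def count_cpg(seq: str) -> int:
    """Count CG dinucleotides (CpG sites) in a sequence."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def creates_cpg(seq: str, pos: int) -> bool:
    """Return True if the transition mutation at pos adds a CpG site."""
    seq = seq.upper()
    mutated = seq[:pos] + TRANSITIONS[seq[pos]] + seq[pos + 1:]
    return count_cpg(mutated) > count_cpg(seq)


def classify_transitions(seq: str) -> dict:
    """Map each position to True (CpG-creating) or False (non-CpG-creating)."""
    seq = seq.upper()
    return {
        pos: creates_cpg(seq, pos)
        for pos in range(len(seq))
        if seq[pos] in TRANSITIONS  # skip ambiguous bases such as N
    }


# Toy sequence (assumed, for illustration only).
toy = "ATGCAT"
cpg_creating_positions = [p for p, v in classify_transitions(toy).items() if v]
```

In the toy sequence, the T at position 1 precedes a G (TG to CG) and the A at position 4 follows a C (CA to CG), so those two transitions are the CpG-creating ones. Comparing allele frequencies between the two classes would then require per-site frequency data, which this sketch does not model.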
Metabolites give us a window into the chemistry of microbes and are split into two subclasses: primary and secondary. Primary metabolites are required for life, whereas secondary metabolites have historically been classified as those that appear after exponential growth and are not necessarily needed for survival. Many microbial species are estimated to produce hundreds of metabolites, and their production can be affected by differing nutrients. Using various analytical techniques, metabolites can be directly detected in order to elucidate their biological significance. Currently, a single experiment can produce anywhere from megabytes to terabytes of data. This big data has motivated scientists to develop informatics tools to help target specific metabolites or sets of metabolites. Broadly, it is imperative to identify clear biological questions before embarking on a study of metabolites (metabolomics). For instance, studying the effect of a transposon insertion on phenazine biosynthesis in Pseudomonas is very different from asking what molecules are present in a specific banana-derived strain of Pseudomonas. This review is meant to serve as a primer for a ‘choose your own adventure’ approach for microbiologists with limited mass spectrometry expertise, with a strong focus on liquid chromatography-mass spectrometry-based workflows developed or optimized within the past five years.