The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry techniques are well-suited to high-throughput characterization of natural products, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social molecular networking (GNPS, http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry (MS/MS) data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of ‘living data’ through continuous reanalysis of deposited data.
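The idea of linking MS/MS spectra into a network can be illustrated with a small sketch. It is not the GNPS scoring method: it assumes a plain cosine score on fixed-width m/z bins with square-root intensity scaling, and the names `binned_vector`, `cosine_score`, `build_network`, the bin width of 0.5 and the edge threshold of 0.7 are all illustrative choices, not values from the abstract.

```python
import math
from collections import defaultdict


def binned_vector(peaks, bin_width=0.5):
    """Sum square-root-scaled intensities into fixed-width m/z bins."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += math.sqrt(intensity)
    return vec


def cosine_score(peaks_a, peaks_b, bin_width=0.5):
    """Plain cosine similarity between two binned spectra (0.0 to 1.0)."""
    a = binned_vector(peaks_a, bin_width)
    b = binned_vector(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_network(spectra, threshold=0.7):
    """Return (id_a, id_b, score) edges for spectrum pairs above the threshold."""
    ids = sorted(spectra)
    edges = []
    for i, id_a in enumerate(ids):
        for id_b in ids[i + 1:]:
            score = cosine_score(spectra[id_a], spectra[id_b])
            if score >= threshold:
                edges.append((id_a, id_b, round(score, 3)))
    return edges


# Hypothetical spectra as lists of (m/z, intensity) pairs, for illustration only.
spectra = {
    "s1": [(100.1, 50.0), (150.2, 100.0)],
    "s2": [(100.1, 40.0), (150.2, 120.0)],
    "s3": [(300.0, 10.0)],
}
edges = build_network(spectra)
```

Here `s1` and `s2` share both fragment bins and end up connected, while `s3` shares none and stays isolated. A production workflow would also need to handle precursor mass shifts and peak-matching tolerances, which this sketch leaves out.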
Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify supporting enzymes for key synthases one cluster at a time. In this study, we design and apply a DNA expression array for Aspergillus nidulans in combination with legacy data to form a comprehensive gene expression compendium. We apply a guilt-by-association-based analysis to predict the extent of the biosynthetic clusters for the 58 synthases active in our set of experimental conditions. A comparison with legacy data shows the method to be accurate in 13 of 16 known clusters and nearly accurate for the remaining 3 clusters. Furthermore, we apply a data clustering approach, which identifies cross-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.

aspergilli | natural products | secondary metabolism | polyketide synthases

No other group of biochemical compounds holds as much promise for drug development as the secondary (nongrowth associated) metabolites (SMs). A review from 2012 (1) found that for small-molecule pharmaceuticals, 68% of the anticancer agents and 52% of the antiinfective agents are natural products, or derived from natural products.
The fact that SMs are often synthesized as polymer backbones that are subsequently diversified greatly via the actions of tailoring enzymes sets the stage for combinatorial biochemistry (2), because their biosynthesis is modular. Major groups of SMs include polyketides (PKs) consisting of -CH2-(C=O)- units, ribosomal and nonribosomal peptides (NRPs), and terpenoids made from C5 isoprene units. These polymer backbones are, with the exception of ribosomal peptides, made by synthases or synthetases and are modified by a plethora of tailoring enzymes, including (de)hydratases, oxygenases, hydrolases, methylases, and others. In fungi, these biosynthetic genes of secondary metabolism are organized in discrete clusters around the synthase genes. Although quite accurate algorithms are available for identification of possible SM biosynthetic genes, particularly PK synthases (PKSs), NRP synthetases (NRPSs), and dimethylallyl tryptophan synthases (DMATSs) (3, 4), the assignment and prediction of the members of the individual clusters solely from the genome sequence have not been accurate. Relevant protein domains can be predicted for some of the genes (e....
Streptomycetes serve as major producers of various pharmacologically and industrially important natural products. Although CRISPR-Cas9 systems have been developed for more robust genetic manipulations, concerns of genome instability caused by the DNA double-strand breaks (DSBs) and the toxicity of Cas9 remain. To overcome these limitations, here we report the development of the DSB-free, single-nucleotide–resolution genome editing system CRISPR-BEST (CRISPR-Base Editing SysTem), which comprises a cytidine (CRISPR-cBEST) and an adenosine (CRISPR-aBEST) deaminase-based base editor. Specifically targeted by an sgRNA, CRISPR-cBEST can efficiently convert a C:G base pair to a T:A base pair and CRISPR-aBEST can convert an A:T base pair to a G:C base pair within a window of approximately 7 and 6 nucleotides, respectively. CRISPR-BEST was validated and successfully used in different Streptomyces species. In particular, in the nonmodel actinomycete Streptomyces collinus Tü365, CRISPR-cBEST efficiently and simultaneously inactivated the 2 copies of the kirN gene in the duplicated kirromycin biosynthetic pathways by STOP codon introduction. Generating such a knockout mutant repeatedly failed using the conventional DSB-based CRISPR-Cas9. An unbiased, genome-wide off-target evaluation indicates the high fidelity and applicability of CRISPR-BEST. Furthermore, the system supports multiplexed editing with a single plasmid by providing a Csy4-based sgRNA processing machinery. To simplify the protospacer identification process, we also updated CRISPy-web (https://crispy.secondarymetabolites.org), which now allows designing sgRNAs specifically for CRISPR-BEST applications.
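STOP codon introduction by C-to-T editing can be sketched in a few lines. This is a simplified illustration, not the CRISPy-web algorithm, and it rests on several assumptions not in the abstract: an NGG PAM directly 3' of a 20-nt protospacer, an editing window of 7 nucleotides starting at a hypothetical offset `window_start=3` from the PAM-distal end, and the sense strand only. The function name `find_stop_sites` is likewise illustrative.

```python
# Sense-strand codons that become a STOP codon when their first C is edited to T.
STOP_PRECURSORS = {"CAA": "TAA", "CAG": "TAG", "CGA": "TGA"}


def find_stop_sites(cds, window_start=3, window_len=7, spacer_len=20):
    """List protospacers whose assumed editing window covers an in-frame
    CAA, CAG or CGA codon. `cds` must start at codon position 0."""
    cds = cds.upper()
    hits = []
    for start in range(0, len(cds) - spacer_len - 3 + 1):
        pam = cds[start + spacer_len:start + spacer_len + 3]
        if pam[1:] != "GG":  # assumed NGG PAM
            continue
        window_lo = start + window_start
        for pos in range(window_lo, window_lo + window_len):
            if pos % 3 != 0:  # the edited C must be the first base of a codon
                continue
            codon = cds[pos:pos + 3]
            if codon in STOP_PRECURSORS:
                hits.append({
                    "protospacer": cds[start:start + spacer_len],
                    "pam": pam,
                    "codon_index": pos // 3,
                    "edit": f"{codon}>{STOP_PRECURSORS[codon]}",
                })
    return hits


# Hypothetical toy coding sequence, for illustration only.
toy_cds = "ATG" + "GCT" + "CAG" + "GCT" * 4 + "GCA" + "GGT" + "TAA"
hits = find_stop_sites(toy_cds)
```

For the toy sequence, the only candidate is the CAG codon at codon index 2, which would become TAG. A real design tool would also scan the antisense strand (where editing turns TGG into a STOP codon on the sense strand), account for other editable Cs in the window, and check for off-targets.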
In drug discovery, reliable and fast dereplication of known compounds is essential for identification of novel bioactive compounds. Here, we show an integrated approach using ultra-high performance liquid chromatography-diode array detection-quadrupole time of flight mass spectrometry (UHPLC-DAD-QTOFMS) providing both accurate mass full-scan mass spectrometry (MS) and tandem high resolution MS (MS/HRMS) data. The methodology was demonstrated on compounds from bioactive marine-derived strains of Aspergillus, Penicillium, and Emericellopsis, including small polyketides, non-ribosomal peptides, terpenes, and meroterpenoids. The MS/HRMS data were then searched against an in-house MS/HRMS library of ~1300 compounds for unambiguous identification. The full-scan MS data were used for dereplication of compounds not in the MS/HRMS library, combined with ultraviolet/visible (UV/Vis) and MS/HRMS data for faster exclusion of database search results. This led to the identification of four novel isomers of the known anticancer compound asperphenamate. Except for very low intensity peaks, no false negatives were found using the MS/HRMS approach, which proved to be robust against poor data quality caused by system overload or loss of lock-mass. Only for small polyketides, like patulin, were both retention time and UV/Vis spectra necessary for unambiguous identification. For the ophiobolin family, with many structurally similar analogues partly co-eluting, the peaks could be assigned correctly by combining MS/HRMS data and m/z of the [M + Na]+ ions.
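Accurate-mass dereplication of full-scan data comes down to comparing an observed m/z with theoretical adduct masses of known compounds. The sketch below is a minimal illustration of that comparison, not the authors' workflow: the 5 ppm tolerance, the two-entry library, and the names `mono_mass`, `ppm_error` and `dereplicate` are assumptions made for the example.

```python
# Monoisotopic masses (Da) of the elements used here, plus the electron mass.
MONO = {"C": 12.0, "H": 1.00782503223, "N": 14.00307400443, "O": 15.99491461957}
ELECTRON = 0.00054857991

# Mass added to the neutral molecule M for each singly charged adduct.
ADDUCT_SHIFT = {
    "[M+H]+": 1.00782503223 - ELECTRON,
    "[M+Na]+": 22.98976928 - ELECTRON,
}


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def dereplicate(observed_mz, library, tol_ppm=5.0):
    """Return (name, adduct, ppm error) for library entries within tolerance."""
    hits = []
    for name, formula in library.items():
        neutral = mono_mass(formula)
        for adduct, shift in ADDUCT_SHIFT.items():
            err = ppm_error(observed_mz, neutral + shift)
            if abs(err) <= tol_ppm:
                hits.append((name, adduct, round(err, 2)))
    return hits


# Illustrative two-compound library (formulas only, not the in-house library).
library = {
    "patulin": {"C": 7, "H": 6, "O": 4},
    "asperphenamate": {"C": 32, "H": 30, "N": 2, "O": 4},
}
patulin_hits = dereplicate(155.0339, library)
```

An m/z match on its own cannot separate isomers, which is consistent with the abstract's point that retention time, UV/Vis and MS/HRMS data are needed to narrow down the candidates.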