Antimicrobial resistance is an emerging global health threat necessitating the rapid development of novel antimicrobials. Remarkably, the vast majority of currently available antibiotics are natural products (NPs) isolated from streptomycetes, soil-dwelling bacteria of the genus Streptomyces. However, a huge reservoir of streptomycete NPs remains pharmaceutically untapped, and a compendium thereof could serve as a source of inspiration for the rational design of novel antibiotics. Initially released in 2012, StreptomeDB (http://www.pharmbioinf.uni-freiburg.de/streptomedb) is the first and only public online database that enables the interactive phylogenetic exploration of streptomycetes and their isolated or mutasynthesized NPs. This third release brings substantial improvements over its forerunners, especially in terms of data content. For instance, about 2500 unique NPs were newly annotated through manual curation of about 1300 PubMed-indexed articles published in the five years since the second release. To increase interoperability, StreptomeDB entries were hyperlinked to several spectral, (bio)chemical and chemical vendor databases, and also to a genome-based NP prediction server. Moreover, predicted pharmacokinetic and toxicity profiles were added. Lastly, some recent real-world use cases of StreptomeDB are highlighted to illustrate its applicability in the life sciences.
In recent years, the drug discovery paradigm has shifted toward compounds that covalently modify disease-associated target proteins, because they tend to possess high potency, selectivity, and duration of action. The rational design of novel targeted covalent inhibitors (TCIs) typically starts from resolved macromolecular structures of target proteins in their apo or holo forms. However, the existing TCI databases contain only a small number of covalent protein–ligand (cP–L) complexes. Herein, we report CovPDB, the first database solely dedicated to high-resolution cocrystal structures of biologically relevant cP–L complexes, curated from the Protein Data Bank. For these curated complexes, the chemical structures and warheads of the pre-reactive electrophilic ligands, as well as the covalent bonding mechanisms to their target proteins, were manually annotated by experts. In total, CovPDB contains 733 proteins and 1,501 ligands, relating to 2,294 cP–L complexes, 93 reactive warheads, 14 targetable residues, and 21 covalent mechanisms. Users are provided with an intuitive and interactive web interface that offers multiple search and browsing options for exploring the covalent interactome at the molecular level in order to develop novel TCIs. CovPDB is freely accessible at http://www.pharmbioinf.uni-freiburg.de/covpdb/ and its contents are available for download as flat files in various formats.
Medicinal plants have been widely used in the traditional treatment of ailments and have been proven effective. They still hold an important place in modern drug discovery owing to their chemical and biological diversity. However, the poor documentation of traditional medicine, in developing African countries for instance, can lead to the loss of knowledge related to such practices. In this study, we present the Eastern Africa Natural Products Database (EANPDB), containing the structural and bioactivity information of 1870 unique molecules isolated from about 300 source species from the Eastern African region. This represents the largest collection of natural products (NPs) from this geographical region, covering literature data from 1962 to 2019. The computed physicochemical properties and toxicity profiles of each compound have been included. A comparative analysis of some physicochemical properties (molecular weight, H-bond donors/acceptors, logPo/w, etc.), as well as a scaffold diversity analysis, has been carried out against other published NP databases. EANPDB was combined with the previously published Northern African Natural Products Database (NANPDB) to form a merged African Natural Products Database (ANPDB), containing ∼6500 unique molecules isolated from about 1000 source species (freely available at http://african-compounds.org). As a case study, latrunculins A and B, isolated from the sponge Negombata magnifica (Podospongiidae) and previously reported to have antitumour activities, were identified via substructure searching as molecules to be explored as putative binders of histone deacetylases (HDACs).
Motivation: Much effort has been invested in the identification of protein-protein interactions using text mining and machine learning methods. The extraction of functional relationships between chemical compounds and proteins from literature has received much less attention, and no ready-to-use open-source software is so far available for this task. Method: We created a new benchmark dataset of 2,613 sentences from abstracts containing annotations of proteins, small molecules, and their relationships. Two kernel methods, the shallow linguistic kernel and the all-paths graph kernel, were applied to classify these relationships as functional or non-functional. Furthermore, the benefit of interaction verbs in sentences was evaluated. Results: In cross-validation on our benchmark dataset, the all-paths graph kernel (AUC value: 84.6%, F1 score: 79.0%) shows slightly better results than the shallow linguistic kernel (AUC value: 82.5%, F1 score: 77.2%). Both models achieve state-of-the-art performance in the research area of relation extraction. Furthermore, combining the shallow linguistic and all-paths graph kernels could slightly increase the overall performance. We used each of the two kernels to identify functional relationships in all PubMed abstracts (29 million) and provide the results, including recorded processing time. Availability
Newly discovered functional relationships of (bio-)molecules are a key component in molecular biology and life science research. Especially in the drug discovery field, knowledge of how small molecules associate with proteins plays a fundamental role in understanding how drugs or metabolites can affect cells, tissues, and human metabolism. Finding relevant information about these relationships among the huge number of published articles is becoming increasingly challenging and time-consuming. On average, more than 25,000 new (bio-)medical articles are added to the literature database PubMed weekly. In this work, we present a new web server (CPRiL) which provides information on functional relationships between small molecules and proteins in literature. Currently, CPRiL contains ∼465,000 unique names and synonyms of small molecules, ∼100,000 unique proteins, and more than 9 million described functional relationships between these entities. The BioBERT machine learning model applied to determine functional relationships between small molecules and proteins in texts was extensively trained and tested. On a related benchmark, CPRiL yielded a high performance, with an F1-score of 84.3%, precision of 82.9%, and recall of 85.7%. Availability: CPRiL is freely available at https://www.pharmbioinf.uni-freiburg.de/cpril.
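The F1-score reported for CPRiL can be cross-checked against the stated precision and recall, since F1 is the harmonic mean of the two. The short Python sketch below is only an illustrative consistency check; the function name `f1_score` and the variable names are assumptions introduced here and are not part of CPRiL.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, converted from percent to fractions.
reported_precision = 0.829
reported_recall = 0.857
reported_f1 = 0.843

computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.3f}")  # prints: computed F1 = 0.843
```

The computed value agrees with the reported 84.3% to three decimal places, so the three figures are mutually consistent.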