Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.
Tandem mass spectrometry-based shotgun proteomics of distal gut microbiomes is exceedingly difficult due to the inherent complexity and taxonomic diversity of the samples. We introduce two new methodologies to improve metaproteomic studies of microbiome samples. These methods include stable isotope labeling in mammals to permit protein quantitation across two mouse cohorts, as well as the application of activity-based probes to enrich and analyze both host and microbial proteins with specific functionalities. We used these technologies to study the microbiota from the adoptive T cell transfer mouse model of inflammatory bowel disease (IBD) and compare these samples to an isogenic control, thereby limiting genetic and environmental variables that influence microbiome composition. The data and results generated highlight quantitative alterations in both host and microbial proteins due to intestinal inflammation and corroborate the observed phylogenetic changes in bacteria that accompany IBD in humans and mouse models. The combination of isotope labeling with shotgun proteomics resulted in the total identification of 4434 protein clusters expressed in the microbial proteomic environment, 276 of which demonstrated differential abundance between control and IBD mice. Notably, application of a novel cysteine-reactive probe uncovered several microbial proteases and hydrolases overrepresented in the IBD mice. Implementation of these methods demonstrates that substantial insights into the identity and dysregulation of host and microbial proteins altered in IBD can be obtained, and that the methods can be used in the interrogation of other microbiome-related diseases.
Motivation: Drug repositioning is an attractive alternative to de novo drug discovery due to reduced time and costs to bring drugs to market. Computational repositioning methods, particularly non-black-box methods that can account for and predict a drug’s mechanism, may provide great benefit for directing future development. By tuning both data and algorithm to utilize relationships important to drug mechanisms, a computational repositioning algorithm can be trained to both predict and explain mechanistically novel indications. Results: In this work, we examined the 123 curated drug mechanism paths found in the drug mechanism database (DrugMechDB) and, after identifying the most important relationships, integrated 18 data sources to produce a heterogeneous knowledge graph, MechRepoNet, capable of capturing the information in these paths. We applied the Rephetio repurposing algorithm to MechRepoNet using only a subset of relationships known to be mechanistic in nature and found adequate predictive ability on an evaluation set, with an AUROC value of 0.83. The resulting repurposing model allowed us to prioritize paths in our knowledge graph to produce a predicted treatment mechanism. We found that DrugMechDB paths, when present in the network, were rated highly among predicted mechanisms. We then demonstrated MechRepoNet’s ability to use mechanistic insight to identify a drug’s mechanistic target, with a mean reciprocal rank of 0.525 on a test set of known drug-target interactions. Finally, we walked through repurposing examples of the anti-cancer drug imatinib for use in the treatment of asthma, and metolazone for use in the treatment of osteoporosis, to demonstrate this method’s utility in providing mechanistic insight into the repurposing predictions it makes. Availability: The Python code to reproduce the entirety of this analysis is available at: https://github.com/SuLab/MechRepoNet Supplementary information: Supplementary data are available at Bioinformatics online.
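For readers unfamiliar with the mean reciprocal rank reported above, the following is a minimal sketch of how such a score is typically computed. It is not taken from the MechRepoNet code base; the function name, the input layout, and the convention that a missing target contributes zero are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    true_targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the known target across all queries.

    rankings[i] is a hypothetical best-first list of candidate targets
    predicted for drug i; true_targets[i] is its known target.
    Assumption: a target missing from the list contributes 0.
    """
    if len(rankings) != len(true_targets):
        raise ValueError("rankings and true_targets must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")

    total = 0.0
    for ranked, truth in zip(rankings, true_targets):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == truth:
                total += 1.0 / position
                break  # only the first (best) occurrence counts
    return total / len(rankings)


# Toy data, invented for illustration only.
example_rankings = [
    ["T1", "T2", "T3"],  # known target ranked 1st -> 1/1
    ["T2", "T1", "T3"],  # known target ranked 2nd -> 1/2
    ["T3", "T2", "T1"],  # known target ranked 3rd -> 1/3
]
example_truth = ["T1", "T1", "T1"]
example_mrr = mean_reciprocal_rank(example_rankings, example_truth)
```

Under this definition, a score near 0.5 corresponds to the known target appearing around second place on average, although individual ranks can vary widely around that figure.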
Background: Computational compound repositioning has the potential for identifying new uses for existing drugs, and new algorithms and data source aggregation strategies provide ever-improving results via in silico metrics. However, even with these advances, the number of compounds successfully repositioned via computational screening remains low. New strategies for algorithm evaluation that more accurately reflect the repositioning potential of a compound could provide a better target for future optimizations. Results: Using a text-mined database, we applied a previously described network-based computational repositioning algorithm, yielding strong results via cross-validation, averaging 0.95 AUROC on test-set indications. However, to better approximate a real-world scenario, we built a time-resolved evaluation framework. At various time points, we built networks corresponding to prior knowledge for use as a training set, and then predicted on a test set composed of indications that were subsequently described. This framework showed a marked reduction in performance, with performance metrics peaking for the 1985 network at an AUROC of 0.797. Examining performance reductions due to removal of specific types of relationships highlighted the importance of drug-drug and disease-disease similarity metrics. Using data from future time points, we demonstrate that further acquisition of these kinds of data may help improve computational results. Conclusions: Evaluating a repositioning algorithm using indications unknown to the input network better tunes its ability to find emerging drug indications, rather than finding those which have been randomly withheld. Focusing efforts on improving algorithmic performance in a time-resolved paradigm may further improve computational repositioning predictions.
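The time-resolved evaluation described above can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the `Fact` record, the `treats` relation name, and the rule of dating each fact by a single year are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Fact:
    """One edge of a hypothetical knowledge graph, dated by first report."""
    subject: str
    relation: str
    obj: str
    year: int


def time_resolved_split(
    facts: Sequence[Fact],
    cutoff_year: int,
    target_relation: str = "treats",  # assumed name for indication edges
) -> Tuple[List[Fact], List[Fact], List[Fact]]:
    """Split facts into (network, train_indications, test_indications).

    The network holds everything known up to and including cutoff_year.
    Training indications are the target-relation edges inside that network.
    Test indications are target-relation edges first described afterwards.
    """
    network = [f for f in facts if f.year <= cutoff_year]
    train = [f for f in network if f.relation == target_relation]
    test = [
        f for f in facts
        if f.relation == target_relation and f.year > cutoff_year
    ]
    return network, train, test


# Toy facts, invented for illustration only.
toy_facts = [
    Fact("drugA", "binds", "geneX", 1975),
    Fact("drugA", "treats", "disease1", 1980),
    Fact("geneX", "associates", "disease2", 1983),
    Fact("drugA", "treats", "disease2", 1992),
    Fact("drugB", "binds", "geneX", 1995),
]
toy_network, toy_train, toy_test = time_resolved_split(toy_facts, cutoff_year=1985)
```

The contrast with random cross-validation is that no edge dated after the cutoff, of any type, is available when the model is trained, so the test indications are unknown to the input network rather than merely withheld from it.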