Background
Microbial strain databases provide valuable data for basic microbial research and applications. However, they rarely contain information on the genetic manipulation systems available for each strain.
Results
We established a comprehensive microbial strain database, SynBioStrainFinder, by integrating CRISPR/Cas gene-editing system information with cultivation methods, genome sequence data, and compound-related information. The database is organized into three modules, Strain2Gms/PredStrain2Gms, Strain2BasicInfo, and Strain2Compd, which together form a rapid strain information query system that is curated, integrated, and accessible on a single platform. To date, 1426 CRISPR/Cas gene-editing records covering 157 microbial strains have been manually extracted from the literature for the Strain2Gms module. For strains without an established CRISPR/Cas system, the PredStrain2Gms module recommends the system of the most closely related strain as a reference for constructing a new CRISPR/Cas gene-editing system. The database contains 139,499 records of strain cultivation and genome sequences, and 773,298 records of strain-related compounds. To make the data simple and intuitive to use, all microbial strains are also labeled with stars based on the order and availability of strain information. SynBioStrainFinder provides a user-friendly interface for querying, browsing, and visualizing detailed information on microbial strains, and it is publicly available at http://design.rxnfinder.org/biosynstrain/.
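The recommendation step attributed to PredStrain2Gms can be illustrated with a minimal Python sketch. The abstract does not state how relatedness between strains is measured, so the sketch assumes a precomputed table of pairwise distances; the function name `recommend_reference_strain`, the distance table, and the placeholder strain names are all hypothetical and are not taken from SynBioStrainFinder.

```python
from typing import Dict, Optional, Set, Tuple


def recommend_reference_strain(
    query: str,
    distances: Dict[str, Dict[str, float]],
    strains_with_system: Set[str],
) -> Optional[Tuple[str, float]]:
    """Return the closest strain that has a curated CRISPR/Cas record.

    `distances[query][other]` is an assumed, precomputed dissimilarity
    between two strains (smaller means more closely related). Returns
    None when no curated strain is reachable from the query.
    """
    candidates = [
        (dist, strain)
        for strain, dist in distances.get(query, {}).items()
        if strain in strains_with_system and strain != query
    ]
    if not candidates:
        return None
    dist, strain = min(candidates)  # smallest distance wins
    return strain, dist


# Placeholder data for illustration only; not real strains or distances.
example_distances = {
    "strain_Q": {"strain_A": 0.12, "strain_B": 0.05, "strain_C": 0.30},
}
example_curated = {"strain_A", "strain_C"}
print(recommend_reference_strain("strain_Q", example_distances, example_curated))
```

In this toy input, `strain_B` is the nearest neighbour but has no curated record, so the sketch falls back to the nearest strain that does have one.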
Conclusion
SynBioStrainFinder is the first microbial strain database to combine manually curated information on strain CRISPR/Cas systems with other microbial strain information. It also provides reference information for the construction of new CRISPR/Cas systems. SynBioStrainFinder will serve as a useful resource for extending microbial strain research and its application in biomanufacturing.
Summary
Living cell strains have important applications in synthesising their native compounds and have potential for use in studies exploring the universal chemical space. Here, we present a web server named Cell2Chem, which accelerates the search for explored compounds in organisms and facilitates investigations of biosynthesis in unexplored chemical spaces. Cell2Chem employs co-occurrence networks and natural language processing to provide a systematic method for linking living organisms to biosynthesised compounds and the processes that produce these compounds. The Cell2Chem platform comprises 40,370 species and 125,212 compounds. Using in silico methods for predicting reaction pathways and enzyme functions, Cell2Chem reveals possible biosynthetic pathways of compounds and catalytic functions of proteins to expand unexplored biosynthetic chemical spaces. Cell2Chem can help improve biosynthesis research and enhance the efficiency of synthetic biology.
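As a rough illustration of the co-occurrence idea mentioned above, the following Python sketch counts how often a species name and a compound name appear in the same sentence. It is an assumption-laden simplification, not Cell2Chem's pipeline: matching is plain case-insensitive substring search, and the function name `cooccurrence_counts` and the example sentences are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Iterable, Tuple


def cooccurrence_counts(
    sentences: Iterable[str],
    species_terms: Iterable[str],
    compound_terms: Iterable[str],
) -> Counter:
    """Count sentences in which a species and a compound are both mentioned.

    Each (species, compound) key is an edge of a co-occurrence network,
    and its count can serve as the edge weight.
    """
    species_terms = list(species_terms)
    compound_terms = list(compound_terms)
    counts: Counter = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Naive substring matching; a real system would use entity recognition.
        found_species = {s for s in species_terms if s.lower() in text}
        found_compounds = {c for c in compound_terms if c.lower() in text}
        for pair in product(found_species, found_compounds):
            counts[pair] += 1
    return counts


# Illustrative sentences written for this sketch, not drawn from any corpus.
example_sentences = [
    "Escherichia coli produces indole.",
    "Indole was detected in Escherichia coli cultures.",
    "Saccharomyces cerevisiae produces ethanol.",
]
edges = cooccurrence_counts(
    example_sentences,
    ["Escherichia coli", "Saccharomyces cerevisiae"],
    ["indole", "ethanol"],
)
print(edges)
```

Pairs that never share a sentence simply do not appear as keys, so the resulting counter is a sparse edge list for the network.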
Availability and Implementation
Cell2Chem is available at: http://www.rxnfinder.org/cell2chem/