Genomic data have considerably accelerated contemporary research in traditional medicine. However, the lack of a uniform format and the dispersed storage of herb genomic data limit their full potential. In this study, we developed the Global Pharmacopoeia Genome Database (GPGD). The database contains 34,346 records for 903 herb species from eight global pharmacopoeias (Brazilian, Egyptian, European, Indian, Japanese, Korean, the Pharmacopoeia of the People’s Republic of China, and the U.S. Pharmacopoeia’s Herbal Medicines Compendium). Specifically, the GPGD contains 21,872 DNA barcodes from 867 species, 2,203 organelle genomes from 674 species, 55 whole genomes from 49 species, 534 genomic sequencing datasets from 366 species, and 9,682 transcriptome datasets from 350 species. Among the organelle genomes, 534 genomes from 366 species were newly generated in this study. Whole genomes, organelle genomes, genomic fragments, transcriptomes, and DNA barcodes were uniformly formatted and arranged by species. The GPGD is publicly accessible at http://www.gpgenome.com and serves as an essential resource for species identification, decomposition of biosynthetic pathways, and molecular-assisted breeding analysis. Thus, the database is an invaluable resource for future studies on herbal medicine safety, drug discovery, and the protection and rational use of herbal resources.
Supporting Information
The supporting information is available online at 10.1007/s11427-021-1968-7. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.