The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support global academic and industrial communities. With multi-omics data accumulating at an unprecedented rate, CNCB-NGDC constantly expands and updates its core database resources through big data archiving, integrative analysis and value-added curation. In the past year, efforts have been devoted to integrating multiple omics data, synthesizing the growing knowledge, developing new resources and upgrading a set of major resources. In particular, several database resources have been newly developed for infectious diseases and microbiology (MPoxVR, KGCoV, ProPan), cancer-trait association (ASCancer Atlas, TWAS Atlas, Brain Catalog, CCAS) and tropical plants (TCOD). Importantly, given the global health threat posed by monkeypox virus and SARS-CoV-2, CNCB-NGDC has newly constructed the monkeypox virus resource and frequently updates SARS-CoV-2 genome sequences, variants and haplotypes. All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
Background: Interstitial lung diseases (ILDs), a diverse group of diffuse lung diseases, mainly affect the lung parenchyma. The low-throughput ‘omics’ technologies (genomics, transcriptomics, proteomics) and related drug information have begun to reshape our understanding of ILDs. However, these data are scattered across a large body of literature and are difficult to exploit fully. Therefore, we manually mined and summarized these data in a database (ILDGDB, http://ildgdb.org/) and will continue to update it in the future. Main body: The current version of ILDGDB incorporates 2018 entries representing 20 ILDs and over 600 genes, obtained from over 3000 articles across four species. Each entry contains detailed information, including species, disease type, a detailed description of the gene (e.g. the official gene symbol) and the original reference. ILDGDB is free and provides a user-friendly web page. Users can easily search for genes of interest, view their expression patterns and detailed information, manage gene sets and submit novel ILD-gene associations. Conclusion: The main principle behind ILDGDB’s design is to provide an exploratory platform with minimum filtering and interpretation while presenting the data in a very accessible way. This should help researchers decipher gene mechanisms and improve the prevention, diagnosis and therapy of ILDs.
Homology is fundamental to inferring genes’ evolutionary processes and relationships with shared ancestry. Existing homologous gene resources differ in their inference methods, homologous relationships and identifiers, which makes it difficult to choose among them and to map homology results from one resource to another. Here, we present HGD (Homologous Gene Database, https://ngdc.cncb.ac.cn/hgd), a comprehensive homolog resource that integrates multi-species, multi-resource and multi-omics data and complements existing resources with a public, one-stop data service. Currently, HGD houses a total of 112 383 644 homologous pairs for 37 species, including 19 animals, 16 plants and 2 microorganisms. In addition, HGD integrates various annotations from public resources, including 16 909 homologs with traits, 276 670 homologs with variants, 398 573 homologs with expression and 536 852 homologs with gene ontology (GO) annotations. HGD provides a wide range of omics-based gene function annotations to help users gain a deeper understanding of gene function.
Convolutional neural networks (CNNs) have been widely used for fine-grained image classification, which has proven to be an effective approach for classifying and identifying specific species. Several image-based methods have been proposed for dog breed classification. However, the highest accuracy reported for dogs (about 93%) is still below that achieved for other animals or plants (more than 95% on birds and more than 97% on flowers). In this study, we used the Stanford Dog Dataset, combined image features from four CNN models, filtered the features using principal component analysis (PCA) and the gray wolf optimization (GWO) algorithm, and then classified the features with a support vector machine (SVM). The classification accuracy reached 95.24% for 120 breeds and 99.34% for 76 selected breeds, a significant improvement over existing methods evaluated on the same Stanford Dog Dataset. We expect that the proposed method will serve as a fundamental framework for accurate classification of a wider range of species.
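The GWO feature-filtering step mentioned above can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function name `grey_wolf_select`, the 0.5 threshold used to turn positions into a feature mask, and the `toy_fitness` function are all assumptions made for illustration. In a setting like the one described, the fitness would more plausibly be a classifier's validation error on the selected features.

```python
import random

def grey_wolf_select(fitness, n_features, n_wolves=12, n_iters=60, seed=0):
    """Basic grey wolf optimizer that searches for a 0/1 feature mask.

    `fitness` takes a mask (list of 0/1) and returns a value to minimize.
    Positions live in [0, 1]; a feature counts as selected when its
    coordinate exceeds 0.5 (an illustrative assumption).
    """
    rng = random.Random(seed)

    def to_mask(pos):
        return [1 if x > 0.5 else 0 for x in pos]

    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, float("inf")

    for it in range(n_iters + 1):
        # Rank the pack; the three best wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w)))
        leaders = [list(w) for w in ranked[:3]]
        top_score = fitness(to_mask(leaders[0]))
        if top_score < best_score:
            best_mask, best_score = to_mask(leaders[0]), top_score
        if it == n_iters:
            break  # final pass only evaluates the last pack

        a = 2.0 * (1 - it / n_iters)  # shrinks linearly from 2 toward 0
        new_wolves = []
        for w in wolves:
            pos = []
            for d in range(n_features):
                total = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    total += leader[d] - A * dist
                # Average the three leader-guided moves and clamp to [0, 1].
                pos.append(min(1.0, max(0.0, total / 3)))
            new_wolves.append(pos)
        wolves = new_wolves

    return best_mask, best_score

def toy_fitness(mask):
    """Hypothetical stand-in for classifier error: features 0-3 are useful."""
    informative = {0, 1, 2, 3}
    missed = sum(1 for i in informative if mask[i] == 0)
    extra = sum(mask[i] for i in range(len(mask)) if i not in informative)
    return missed + 0.1 * extra

mask, score = grey_wolf_select(toy_fitness, n_features=10)
print(mask, score)
```

The search is stochastic and is not guaranteed to find the best mask; fixing `seed` only makes a run repeatable.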