The Comparative Toxicogenomics Database (CTD).

Mattingly, Carolyn J.; Colby, Glenn T.; Forrest, Jeffrey Yi‐Lin; Boyer, James L.

doi:10.1289/ehp.6028

Cited by 198 publications

(84 citation statements)

References 27 publications

“…This approach presents challenges for transcripts that are expressed at very low levels, have significant tissue- or age-specific requirements, or are from species for which minimal sequencing has been done (Schwartz et al 2000). A new, publicly available resource, the Comparative Toxicogenomics Database (CTD; http://ctd.mdibl.org; Mattingly et al 2003, 2004a), provides multiple alignment and phylogenetic analysis results with sequences from diverse organisms for biomedically significant genes and proteins. CTD provides access to data valuable for identifying homologous genomic sequences and confirming gene and gene feature predictions.…”

Section: The Contributions Of Marine and Freshwater Organisms To Comp

mentioning

confidence: 99%

Marine Organism Cell Biology and Regulatory Sequence Discovery in Comparative Functional Genomics

et al. 2004

The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure-function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5′-upstream, 3′-downstream and introns of genes. Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention.

“…Toxicology laboratories in the United Kingdom have recently developed a set of toxicology tailored microarrays known as ToxBlot and ToxBlot II, which are specifically for use in toxicogenomic assays (Pennie, 2000, 2002). Subset-specific microarray databases, such as the Comparative Toxicogenomics Database (CTD), which contains toxicogenomic microarray data, are also under development (Mattingly et al, 2003; Medlin, 2002). These types of specialty-tailored microarrays could become increasingly important for the initiation of hypothesis-driven research as we extrapolate information about the individual genes comprising the human genome.…”

Section: Global Transcriptional Analysis-DNA Microarrays

mentioning

confidence: 99%

“…The Comparative Toxicogenomics Database (CTD) has been launched in order to provide an annotated guide to toxicologically significant genes (Mattingly et al, 2003).…”

mentioning

confidence: 99%

Environmental Health Research in the Post-Genome Era: New Fields, New Challenges, and New Opportunities

Bower

Shi

2005

Journal of Toxicology and Environmental Health, Part B

The human genome sequence provides researchers with a genetic framework to eventually understand the relationships of gene-environment interactions. This wealth of information has led to the birth of several related areas of research, including proteomics, functional genomics, pharmacogenomics, and toxicogenomics. Developing techniques such as DNA/protein microarrays, small-interfering RNA (siRNA) applications, two-dimensional gel electrophoresis, and mass spectrometry in conjunction with advanced analysis software and the availability of Internet databases offers a powerful set of tools to investigate an individual's response to specific stimuli. This review summarizes these emerging scientific fields and techniques focusing specifically on their applications to the complexities of gene-environment interactions and their potential role in environmental biosecurity.

“…Several well-known databases employ manual curation of biomedical literature to provide comprehensive coverage of such relationships in humans. Examples of these include OMIM [10], HGMD [11], Comparative Toxicogenomics Database (CTD) [12], GHR (http://ghr.nlm.nih.gov/) and UniProtKB [9]. Recent efforts in the direction of (semi-)automated approaches to facilitate database curation of genotype-phenotype relationships include extraction of sequence variation information from biomedical text.…”

Section: Introduction

mentioning

confidence: 99%

Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine

Singhal

Simmons

2016

PLoS Comput Biol

The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient’s genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer’s disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. 
Across all diseases, our approach returned 272 triplets (disease-gene-variant) that overlapped with entries in UniProt and 5,384 triplets without overlap in UniProt. Analysis of the overlapping triplets and of a stratified sample of the non-overlapping triplets revealed accuracies of 93% and 80% for the respective categories (cumulative accuracy, 77%). We conclude that our process represents an important and broadly applicable improvement to the state of the art for curation of disease-gene-variant relationships.
