The NCI Thesaurus is a reference terminology covering areas of basic and clinical
science, built with the goal of facilitating translational research in cancer. It contains
nearly 110 000 terms in approximately 36 000 concepts, partitioned into 20 subdomains,
which include diseases, drugs, anatomy, genes, gene products, techniques,
and biological processes, among others, all with a cancer-centric focus. It was
originally designed to support coding activities across the National Cancer Institute.
Each concept represents a unit of meaning and carries a number of annotations, such
as a preferred name, synonyms, textual definitions, and optional references to
external authorities. In addition, concepts are modelled
with description logic (DL) and defined by their relationships to other concepts;
there are currently approximately 90 types of named relations declared in the
terminology. The NCI Thesaurus is produced by the Enterprise Vocabulary Services
project, a collaborative effort between the NCI Center for Bioinformatics and the
NCI Office of Communications, and is part of the caCORE infrastructure stack
(http://ncicb.nci.nih.gov/NCICB/core). It can be accessed programmatically through
the open caBIO API and browsed via the web (http://nciterms.nci.nih.gov). A history
of editing changes is also accessible through the API. In addition, the Thesaurus is
available for download in various file formats, including the Web Ontology
Language (OWL), to facilitate its use by others.
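
As an illustration of working with an OWL download, the following minimal sketch reads concepts, their labels, and their parent concepts from an RDF/XML fragment using only the Python standard library. The fragment, the concept identifiers (`Example_Neoplasm`, `Example_Carcinoma`), and the helper `read_concepts` are hypothetical and are not taken from the Thesaurus; only the standard `owl:Class`, `rdfs:label`, and `rdfs:subClassOf` vocabulary is assumed, and the actual Thesaurus file may encode annotations and relations differently.

```python
import xml.etree.ElementTree as ET

# Standard RDF, RDFS, and OWL namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Hypothetical two-concept fragment; not real Thesaurus content.
SAMPLE_OWL = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Example_Neoplasm">
    <rdfs:label>Example Neoplasm</rdfs:label>
  </owl:Class>
  <owl:Class rdf:ID="Example_Carcinoma">
    <rdfs:label>Example Carcinoma</rdfs:label>
    <rdfs:subClassOf rdf:resource="#Example_Neoplasm"/>
  </owl:Class>
</rdf:RDF>"""


def read_concepts(owl_text):
    """Return {concept_id: {"label": str or None, "parents": [concept_id, ...]}}."""
    rdf_id = "{%s}ID" % NS["rdf"]
    rdf_resource = "{%s}resource" % NS["rdf"]
    root = ET.fromstring(owl_text)
    concepts = {}
    for cls in root.findall("owl:Class", NS):
        concept_id = cls.get(rdf_id)
        if concept_id is None:
            continue  # skip anonymous classes
        parents = []
        for sub in cls.findall("rdfs:subClassOf", NS):
            target = sub.get(rdf_resource)
            if target:  # named parent; nested restrictions are ignored here
                parents.append(target.lstrip("#"))
        concepts[concept_id] = {
            "label": cls.findtext("rdfs:label", namespaces=NS),
            "parents": parents,
        }
    return concepts


concepts = read_concepts(SAMPLE_OWL)
```

This sketch only follows named `rdfs:subClassOf` links; the DL role restrictions that define concepts through other named relations would need a dedicated OWL library to interpret properly.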