Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications

Savova, Guergana; Masanz, James; Ogren, Philip V.; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin; Chute, Christopher G.

doi:10.1136/jamia.2009.001560

Cited by 1,481 publications

(1,015 citation statements)

References 29 publications

Citation types: Supporting, Mentioning (1,007), Contrasting, Unclassified

“…Then, the MITRE MIST tool (Aberdeen et al, 2010) and the Scrubber toolkit (McMurry, Fitch, Savova, Kohane, & Reis, 2013) in the Apache cTAKES NLP engine were used to erase Protected Health Information (PHI) elements from the text. Following de‐identification, the Apache cTAKES NLP engine (Savova et al, 2010) was deployed to extract knowledge by identifying occurrences of concepts defined in the Unified Medical Language System (UMLS) (Bodenreider, 2004) in the text. Apache cTAKES also identifies the context in which the concepts are mentioned in the sentence including negation, patient history, family history, and uncertainty.…”

Section: Methods (mentioning)

confidence: 99%

Phelan‐McDermid syndrome data network: Integrating patient reported outcomes with clinical notes and curated genetic reports

Kothari, Wack, Hassen‐Khodja, et al. 2017

American J of Med Genetics Pt B

The heterogeneity of patient phenotype data is an impediment to research into the origins and progression of neuropsychiatric disorders. This difficulty is compounded in the case of rare disorders such as Phelan‐McDermid Syndrome (PMS) by the paucity of patient clinical data. PMS is a rare syndromic genetic cause of autism and intellectual deficiency. In this paper, we describe the Phelan‐McDermid Syndrome Data Network (PMS_DN), a platform that facilitates research into phenotype–genotype correlation and progression of PMS by: a) integrating knowledge of patient phenotypes extracted from Patient Reported Outcomes (PRO) data and clinical notes (two heterogeneous, underutilized sources of knowledge about patient phenotypes) with curated genetic information from the same patient cohort and b) making this integrated knowledge, along with a suite of statistical tools, available free of charge to authorized investigators on a Web portal, https://pmsdn.hms.harvard.edu. PMS_DN is a Patient Centric Outcomes Research Initiative (PCORI) where patients and their families are involved in all aspects of the management of patient data in driving research into PMS. To foster collaborative research, PMS_DN also makes patient aggregates from this knowledge available to authorized investigators using distributed research networks such as the PCORnet PopMedNet. PMS_DN is hosted on a scalable cloud-based environment and complies with all patient data privacy regulations. As of October 31, 2016, PMS_DN integrates high‐quality knowledge extracted from the clinical notes of 112 patients and curated genetic reports of 176 patients with preprocessed PRO data from 415 patients.

“…Meystre and Haug propose an NLP-based system to extract medical problems from electronic patient records [14]. Clinical NLP frameworks such as cTAKES, proposed by [17], use it for document indexing and retrieval. It has also given rise to automated semi- and fully-supervised annotation techniques and resources: it has inspired the annotation formats used to build clinical annotated corpora such as the CLEF corpus from [16].…”

Section: Related Work (mentioning)

confidence: 99%

Process Fragment Recognition in Clinical Documents

Thorne, Cardillo, Eccher, et al. 2013

AI*IA 2013: Advances in Artificial Intelligence

Abstract. We describe a first experiment on automated activity and relation identification and, more generally, on the automated identification and extraction of computer-interpretable guideline fragments from clinical documents. We rely on clinical entity and relation (activities, actors, artifacts and their relations) recognition techniques and use MetaMap and the UMLS Metathesaurus to provide lexical information. In particular, we study the impact of clinical document syntax and semantics on the precision of activity and temporal relation recognition.

“…MENELAS can analyze reports in French, English and Dutch. cTAKES, a clinical Text Analysis and Knowledge Extraction System, is introduced in [22]. cTAKES is an open-source NLP system that uses rule-based and machine learning techniques to process and extract information to support clinical research.…”

Section: Introduction (mentioning)

confidence: 99%

“…According to that publication, this extraction poses new challenges due to the problems mentioned before. The growth in the use of EHRs has generated a significant development in Medical Language Processing systems (MLP), information extraction techniques and applications [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23].…”

Section: Introduction (mentioning)

confidence: 99%

Clinical Narrative Analytics Challenges

et al. 2016

Abstract. Precision medicine or evidence-based medicine is based on the extraction of knowledge from medical records to provide individuals with the appropriate treatment at the appropriate moment according to the patient features. Despite the efforts of using clinical narratives for clinical decision support, many challenges remain today, such as multilinguality, diversity of terms and formats in different services, acronyms, and negation, to name but a few. The same problems exist when one wants to analyze narratives in literature whose analysis would provide physicians and researchers with highlights. In this talk we will analyze challenges, solutions and open problems and will analyze several frameworks and tools that are able to perform NLP over free text to extract medical entities by means of a Named Entity Recognition process. We will also analyze a framework we have developed to extract and validate medical terms. In particular we present two use cases: (i) medical entities extraction of a set of infectious diseases description texts provided by MedlinePlus and (ii) scales of stroke identification in clinical narratives written in Spanish.
