2005
DOI: 10.1197/jamia.m1695

Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon

Abstract: Objective: The aim of this study was to develop and evaluate a method of extracting noun phrases with full phrase structures from a set of clinical radiology reports using natural language processing (NLP) and to investigate the effects of using the UMLS® Specialist Lexicon to improve noun phrase identification within clinical radiology documents. Design: The noun phrase identification (NPI) module is composed of a sentence boundary detector, a statistical natural language parser trained on a no…

Citations: Cited by 59 publications (44 citation statements)
References: References 30 publications
“…The Stanford parser has been used to provide syntactic clues for identifying key clinical terms in the medical domain [34] and gene and protein names in the biological domain [35], although we disagree with the latter paper that unlexicalised parsers – those that represent words simply by their POS tags – are more suited to the biological domain than lexicalised parsers equipped with a general-English lexicon. While the relative positions of the lexicalised and unlexicalised versions of the Stanford parser in our study depend on which evaluation measure is used, both versions were consistently out-performed by the Bikel and Charniak-Lease parsers, both of whose parsing engines are lexicalised with a general-English vocabulary.…”
Section: Results (mentioning)
confidence: 93%
“…This can be done efficiently by using an appropriate data structure such as a hash table. Systems that use string matching techniques include SAPHIRE (Hersh and Hickam, 1995), IndexFinder (Zou et al, 2003), NIP (Huang et al, 2005) and MaxMatcher (Zhou et al, 2006). With a large lexicon, high precision and acceptable recall were achieved by this approach in their experiments.…”
Section: Related Work (mentioning)
confidence: 99%
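
The hash-table lookup mentioned in the statement above can be illustrated with a minimal sketch. It assumes a toy lexicon and whitespace tokenization, and it is not the implementation of SAPHIRE, IndexFinder, NIP, or MaxMatcher; the function names `build_lexicon` and `longest_match` and the `max_len` parameter are hypothetical.

```python
def build_lexicon(terms):
    """Store each term as a tuple of lowercase tokens in a set (a hash table)."""
    return {tuple(term.lower().split()) for term in terms}


def longest_match(tokens, lexicon, max_len=5):
    """Greedy left-to-right longest match of token n-grams against the lexicon.

    Returns a list of (start_index, matched_term) pairs.
    """
    tokens = [t.lower() for t in tokens]
    matches = []
    i = 0
    while i < len(tokens):
        found = None
        # Try the longest candidate first so multi-word terms win over sub-terms.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in lexicon:  # average O(1) hash lookup
                found = candidate
                break
        if found:
            matches.append((i, " ".join(found)))
            i += len(found)
        else:
            i += 1
    return matches


# Toy lexicon (assumed for illustration only).
lexicon = build_lexicon(["pleural effusion", "effusion", "right lower lobe"])
sentence = "Small pleural effusion in the right lower lobe".split()
result = longest_match(sentence, lexicon)
```

Each candidate phrase is checked with a single set membership test, so the cost per sentence depends on the sentence length and `max_len`, not on the size of the lexicon.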
“…The autocoder tools search through (parse) the text of the report to find keywords or phrases representing the procedure, organ, and diagnosis terms (both assertion and negation of these terms) and assign codes to these based on the UMLS vocabulary [15][16][17][18][19][20][21][22]. Use of codes facilitates searching as discussed below.…”
Section: Building the Local Pathology Database (mentioning)
confidence: 99%