2014
DOI: 10.1186/1471-2105-15-266

Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings

Abstract: Background: Natural Language Processing (NLP) has been shown effective to analyze the content of radiology reports and identify diagnosis or patient characteristics. We evaluate the combination of NLP and machine learning to detect thromboembolic disease diagnosis and incidental clinically relevant findings from angiography and venography reports written in French. We model thromboembolic diagnosis and incidental findings as a set of concepts, modalities and relations between concepts that can be used as feature…

Cited by 95 publications (86 citation statements)
References 31 publications
“…In one study, accuracy for each of three classification tasks in thromboembolic diagnoses (presence, CT technique, and clinically relevant incidental findings) was uniformly increased regardless of the machine learning algorithm used (naïve Bayes model, support vector machine, or maximum entropy) when pattern matching was used to identify relevant concepts and their relationships (eg, "nodule" as a condition and "lingula" as an anatomic structure) (28). The authors of that study also used an NLP-based automated anonymization tool named MEDINA (MEDical Information Anonymization) to identify and replace patient and physician information and to shift dates by a uniform random number (28). Such use of NLP is of particular relevance to research because one could potentially develop a technique to preserve temporal information in the collective EMR data of each individual patient being de-identified.…”
Section: Pulmonary Embolism (mentioning)
confidence: 99%
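The date-shifting idea in the statement above can be illustrated with a minimal Python sketch. It is not MEDINA's implementation, which the statement does not detail. It assumes one offset is drawn uniformly at random per patient and then reused for all of that patient's dates, so intervals between events are preserved. The class name `DateShifter`, the `max_days` bound, and the `seed` parameter are hypothetical.

```python
import random
from datetime import date, timedelta


class DateShifter:
    """Shift every date of a given patient by the same random offset.

    Assumption: one offset per patient, drawn uniformly from
    [-max_days, +max_days]. Reusing the offset keeps the time
    between two events of the same patient unchanged.
    """

    def __init__(self, max_days=365, seed=None):
        self._rng = random.Random(seed)
        self._max_days = max_days
        self._offsets = {}  # patient_id -> timedelta

    def offset_for(self, patient_id):
        # Draw the offset once, then reuse it for this patient.
        if patient_id not in self._offsets:
            days = self._rng.randint(-self._max_days, self._max_days)
            self._offsets[patient_id] = timedelta(days=days)
        return self._offsets[patient_id]

    def shift(self, patient_id, original_date):
        return original_date + self.offset_for(patient_id)


shifter = DateShifter(max_days=365, seed=0)
exam = date(2014, 1, 10)
follow_up = date(2014, 1, 17)
shifted_exam = shifter.shift("patient-1", exam)
shifted_follow_up = shifter.shift("patient-1", follow_up)
# The absolute dates change, but the interval between them does not.
interval_preserved = (shifted_follow_up - shifted_exam) == (follow_up - exam)
```

In practice the offset table would need to be stored securely or discarded, since anyone holding it can reverse the shift.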
“…Another powerful classification model that has become very popular in recent years is the support vector machine, which implicitly maps the features to a much higher dimensional space so as to derive many complex features automatically from the existing one, giving the model much better adaptivity. Both the maximum entropy and support vector machine models are often encountered in radiology NLP applications (22, 24, 26–29).…”
Section: Statistical and Machine Learning Approaches (mentioning)
confidence: 99%
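The "implicit mapping" in the statement above is commonly known as the kernel trick. The sketch below is an illustration under stated assumptions, not code from any of the cited papers. It uses a homogeneous degree-2 polynomial kernel on 2-D inputs and checks that the kernel value equals a dot product in an explicitly expanded feature space. The function names are hypothetical.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: (x . y) ** 2."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map for 2-D input matching poly2_kernel:
    (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x = (1.0, 2.0)
y = (3.0, -1.0)

# Computed in the original 2-D space, without building new features.
implicit = poly2_kernel(x, y)
# Computed after explicitly mapping both points to 3-D.
explicit = dot(poly2_features(x), poly2_features(y))

# The two agree up to floating-point rounding.
same = math.isclose(implicit, explicit)
```

A support vector machine only needs these pairwise kernel values, so it can work with the richer feature space without ever constructing it. For higher degrees or other kernels the expanded space becomes very large or infinite-dimensional.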
“…Ni and colleagues used it to improve oncology trial eligibility screening [130], and Weng and Boland to represent and extract trial eligibility criteria [133,134]. Extracting information to improve treatment and follow-up of patients has been applied to pancreatic [135] and colon neoplasms detection [136], thromboembolism and incidental findings [137], adverse events and errors detection [137], and patients acuity prediction [138]. Finally, information extracted from unstructured clinical data has been used to enable other examples of data reuse discussed below.…”
Section: F Extraction Of Information From Unstructured Clinical Data (mentioning)
confidence: 99%
“…Pham et al developed an NLP pipeline to detect and classify mentions of thromboembolic disease from angiography and venography reports. They used naive Bayes' feature selection then support vector machines and maximum entropy for classification (Pham et al, 2014). Esuli et al developed two novel methods for extracting radiological findings from reports: a cascaded, two-stage ensemble of taggers generated by linear-chain conditional random fields (LC-CRFs) and a confidence-weighted ensemble method combining standard LC-CRFs and the two-stage method (Esuli et al, 2013).…”
Section: Related Work (mentioning)
confidence: 99%