Background
The aim of this study was to build electronic algorithms using a combination of structured data and natural language processing (NLP) of text notes for potential safety surveillance of nine post-operative complications.
Methods
Post-operative complications from six medical centers in the Southeastern United States were obtained from the Veterans Affairs Surgical Quality Improvement Program (VASQIP) registry. Development and test datasets were constructed by stratifying on facility and date of procedure for patients with and without complications. Algorithms were developed from VASQIP outcome definitions using NLP-coded concepts, regular expressions, and structured data. The VASQIP nurse reviewer's determination served as the reference standard for evaluating sensitivity and specificity. The algorithms were designed using the development dataset and evaluated on the test dataset.
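As an illustration only, the general approach described above (a text rule combined with a structured-data flag, then scored against a reference standard) might be sketched as follows. Everything in this sketch is an assumption made for illustration: the regular expressions, the simple negation check, the `positive_urine_culture` field, the OR rule in `flag_uti`, and the sample records are hypothetical and are not the study's algorithms or data.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; not the study's actual expressions.
UTI_PATTERN = re.compile(r"\b(urinary tract infection|uti)\b", re.IGNORECASE)
# A crude negation cue that must appear immediately before the mention.
NEGATION = re.compile(
    r"\b(no|denies|without|negative for)\s+(evidence of\s+)?$", re.IGNORECASE
)


@dataclass
class Record:
    note_text: str                # free-text clinical note
    positive_urine_culture: bool  # assumed structured-data flag
    reference_label: bool         # reference standard (e.g. reviewer determination)


def text_mentions_uti(note: str) -> bool:
    """Return True if the note has at least one non-negated UTI mention."""
    for match in UTI_PATTERN.finditer(note):
        preceding = note[: match.start()]
        if not NEGATION.search(preceding):
            return True
    return False


def flag_uti(record: Record) -> bool:
    """Assumed rule: flag on a non-negated text mention OR a positive culture."""
    return text_mentions_uti(record.note_text) or record.positive_urine_culture


def sensitivity_specificity(predicted, reference):
    """Compute (sensitivity, specificity) from parallel lists of booleans."""
    tp = sum(p and r for p, r in zip(predicted, reference))
    fn = sum((not p) and r for p, r in zip(predicted, reference))
    tn = sum((not p) and (not r) for p, r in zip(predicted, reference))
    fp = sum(p and (not r) for p, r in zip(predicted, reference))
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up records, used only to exercise the functions.
records = [
    Record("Diagnosed with urinary tract infection, started antibiotics.", False, True),
    Record("No evidence of UTI.", False, False),
    Record("Afebrile, doing well.", True, True),
    Record("Possible UTI, await culture.", False, False),
    Record("Dysuria noted.", False, True),
]

predictions = [flag_uti(r) for r in records]
references = [r.reference_label for r in records]
sens, spec = sensitivity_specificity(predictions, references)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy example the text rule and the structured flag each catch a case the other misses, which is the motivation for combining the two sources; the hedged mention ("Possible UTI") and the note with no explicit mention show how false positives and false negatives can arise.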
Results
Sensitivity and specificity in the test set were 85% and 92% for acute renal failure, 80% and 93% for sepsis, 56% and 94% for deep vein thrombosis, 80% and 97% for pulmonary embolism, 88% and 89% for acute myocardial infarction, 88% and 92% for cardiac arrest, 80% and 90% for pneumonia, 95% and 80% for urinary tract infection, and 80% and 93% for wound infection, respectively. A third of the complications occurred outside of the hospital setting.
Conclusions
Computer algorithms applied to data extracted from the electronic health record achieved sensitivity of at least 80% for eight of the nine complications (56% for deep vein thrombosis) and specificity of 80% to 97% across patients seen in six different medical centers. This study demonstrates the utility of combining natural language processing with structured data for mining the information contained within the electronic health record.