Support Vector Feature Selection for Early Detection of Anastomosis Leakage From Bag-of-Words in Electronic Health Records

Soguero-Ruíz, Cristina; Hindberg, Kristian; Rojo‐Álvarez, José Luis; Skrøvseth, Stein Olav; Godtliebsen, Fred; Mortensen, Kim Erlend; Revhaug, Arthur; Lindsetmo, Rolv-Ole; Augestad, Knut Magne; Jenssen, Robert

doi:10.1109/jbhi.2014.2361688

Cited by 63 publications

(41 citation statements)

References 36 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…2014 described this problem, where the author’s condensed similar trajectories from structured diagnosis codes for the entire Danish population10. However, there are methods that have been found to be useful in exploring adverse drug effects, suicide risk, disease severity and patient stratification in EHR353637383940, these methods depends strongly on the availability of structured information41424344.…”

Section: Discussionmentioning

confidence: 99%

Analysis of free text in electronic health records for identification of cancer patient trajectories

Jensen

Soguero-Ruíz

Mikalsen

et al. 2017

Sci Rep

Self Cite

114

View full text Add to dashboard Cite

With an aging patient population and increasing complexity in patient disease trajectories, physicians are often met with complex patient histories from which clinical decisions must be made. Due to the increasing rate of adverse events and hospitals facing financial penalties for readmission, there has never been a greater need to enforce evidence-led medical decision-making using available health care data. In the present work, we studied a cohort of 7,741 patients, of whom 4,080 were diagnosed with cancer, surgically treated at a University Hospital in the years 2004–2012. We have developed a methodology that allows disease trajectories of the cancer patients to be estimated from free text in electronic health records (EHRs). By using these disease trajectories, we predict 80% of patient events ahead in time. By control of confounders from 8326 quantified events, we identified 557 events that constitute high subsequent risks (risk > 20%), including six events for cancer and seven events for metastasis. We believe that the presented methodology and findings could be used to improve clinical decision support and personalize trajectories, thereby decreasing adverse events and optimizing cancer treatment.

show abstract

Section: Discussionmentioning

confidence: 99%

Analysis of free text in electronic health records for identification of cancer patient trajectories

Jensen

Soguero-Ruíz

Mikalsen

et al. 2017

Sci Rep

Self Cite

114

View full text Add to dashboard Cite

show abstract

“…Designing and implementing a regular expression approach requires considerable manual work unless regular expression learning techniques are effectively applied (36, 37). Mapping text to UMLS concepts is one of the extraction methods commonly used in clinical and biomedical NLP studies (38-40). MetaMap tends to perform well in the recognition of texts that can be mapped to medical terms.…”

Section: Discussionmentioning

confidence: 99%

Extractive text summarization system to aid data extraction from full text in systematic review development

Bui

Fiol

Hurdle

et al. 2016

Journal of Biomedical Informatics

View full text Add to dashboard Cite

Objectives Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies too much on manual effort which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process. Methods We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated in finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human written summaries (title and abstract) in terms of the presence of necessary information for the data extraction as presented in the Cochrane review’s study characteristics tables. Results At the sentence level, the computer-generated summaries covered more information than humans do for systematic reviews (recall 91.2% vs. 83.8%, p<0.001). They also had a better density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than individual methods alone, achieving an 84.7% F-measure. Conclusion Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system.

show abstract

“…A bag-of-words (BoW) model was used to represent the presence or absence of each different word that appeared in the clinical narrative [20]. Stop words and words that appeared for fewer than five patients were removed.…”

Section: Classification Based On Specified Anchors 331 Feature Repmentioning

confidence: 99%

Using anchors from free text in electronic health records to diagnose postoperative delirium

Mikalsen

Soguero-Ruíz

Jensen

et al. 2017

Computer Methods and Programs in Biomedicine

Self Cite

View full text Add to dashboard Cite

Objectives. Postoperative delirium is a common complication after major surgery among the elderly. Despite its potentially serious consequences, the complication often goes undetected and undiagnosed. In order to provide diagnosis support one could potentially exploit the information hidden in free text documents from electronic health records using data-driven clinical decision support tools. However, these tools depend on labeled training data and can be both time consuming and expensive to create. Methods. The recent "Learning with Anchors" framework resolves this problem by transforming key observations (anchors) into labels. This is a promising framework, but it is heavily reliant on clinician's knowledge for specifying good anchor choices in order to perform well. In this paper we propose a novel method for specifying anchors from free text documents, following an exploratory data analysis approach based on clustering and data visualization techniques. We investigate the use of the new framework as a way to detect postoperative delirium. Results. By applying the proposed method to medical data gathered from a Norwegian University Hospital, we increase the area under the precision-recall curve from 0.51 to 0.96 compared to baselines. Conclusions. The proposed approach can be used as a framework for clinical decision support for postoperative delirium.

show abstract

Support Vector Feature Selection for Early Detection of Anastomosis Leakage From Bag-of-Words in Electronic Health Records

Cited by 63 publications

References 36 publications

Analysis of free text in electronic health records for identification of cancer patient trajectories

Analysis of free text in electronic health records for identification of cancer patient trajectories

Extractive text summarization system to aid data extraction from full text in systematic review development

Using anchors from free text in electronic health records to diagnose postoperative delirium

Contact Info

Product

Resources

About