The lack of publicly accessible text corpora is a major obstacle for progress in natural language processing. For medical applications, unfortunately, all language communities other than English are low-resourced. In this work, we present GGPONC (German Guideline Program in Oncology NLP Corpus), a freely distributable German language corpus based on clinical practice guidelines for oncology. This corpus is one of the largest ever built from German medical documents. Unlike clinical documents, clinical guidelines do not contain any patient-related information and can therefore be used without data protection restrictions. Moreover, GGPONC is the first corpus for the German language covering diverse conditions in a large medical subfield, and it provides a variety of metadata, such as literature references and evidence levels. By applying and evaluating existing medical information extraction pipelines for German text, we are able to compare the use of medical language in GGPONC with that in other corpora, both medical and non-medical.
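As a minimal sketch of what applying an existing German NLP pipeline to guideline text might look like (this is not the GGPONC pipeline itself; the model name and sample sentence are illustrative assumptions, requiring `pip install spacy` and `python -m spacy download de_core_news_sm`):

```python
# Sketch: running a generic German spaCy pipeline over guideline-style text.
# Tokens, lemmas, and entities are the kinds of outputs typically used for
# corpus statistics and cross-corpus comparisons of language use.
import spacy

nlp = spacy.load("de_core_news_sm")  # generic German model as a stand-in
text = ("Die Patientin erhielt eine adjuvante Chemotherapie "
        "nach Leitlinienempfehlung.")  # hypothetical example sentence
doc = nlp(text)

for token in doc:
    print(token.text, token.pos_, token.lemma_)
for ent in doc.ents:
    print(ent.text, ent.label_)
```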
Objective
To determine the intrarater, interrater, and retest reliability of facial nerve grading of patients with facial palsy (FP) using standardized videos recorded synchronously during a self‐explanatory patient video tutorial.
Study Design
Prospective, observational study.
Methods
The automated videos of 10 patients with varying degrees of FP (5 acute, 5 chronic) and videos recorded without the tutorial of 8 patients (all chronic FP) were rated by five novices and five experts according to the House-Brackmann grading system (HB), the Sunnybrook Grading System (SB), and the Facial Nerve Grading System 2.0 (FNGS 2.0).
Results
Intrarater reliability for the three grading systems was very high using the automated videos (intraclass correlation coefficient [ICC]: SB, ICC = 0.967; FNGS 2.0, ICC = 0.931; HB, ICC = 0.931). Interrater reliability was also high (SB: ICC = 0.921; FNGS 2.0: ICC = 0.837; HB: ICC = 0.736), although for HB, Fleiss' kappa (0.214) and Kendall's W (0.231) were low. Interrater reliability did not differ between novices and experts. Retest reliability was very high (SB: novices ICC = 0.979, experts ICC = 0.964; FNGS 2.0: novices ICC = 0.979, experts ICC = 0.969). The reliability of grading chronic FP with SB was higher using the automated videos with tutorial (ICC = 0.845) than without (ICC = 0.538).
Conclusion
The reliability of grading using the automated videos is excellent, especially for SB grading. We recommend regular use of this automated video tool in clinical routine and in clinical studies.
Level of Evidence
4
Laryngoscope, 129:2274–2279, 2019
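As a minimal sketch (not the authors' code) of how the reliability statistics reported above (ICC, Fleiss' kappa, Kendall's W) can be computed, the following uses hypothetical example ratings and assumes `pip install pingouin statsmodels pandas numpy scipy`:

```python
import numpy as np
import pandas as pd
import pingouin as pg
from scipy.stats import rankdata
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical House-Brackmann grades (1-6): rows = patients, cols = raters.
ratings = np.array([
    [2, 2, 3, 2, 2],
    [4, 4, 4, 5, 4],
    [1, 1, 1, 1, 2],
    [3, 3, 2, 3, 3],
    [6, 5, 6, 6, 6],
])
n_patients, n_raters = ratings.shape

# Interrater ICC via pingouin (requires long format).
long = pd.DataFrame({
    "patient": np.repeat(np.arange(n_patients), n_raters),
    "rater": np.tile(np.arange(n_raters), n_patients),
    "score": ratings.ravel(),
})
icc = pg.intraclass_corr(data=long, targets="patient",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])

# Fleiss' kappa treats grades as nominal categories.
counts, _ = aggregate_raters(ratings)  # patients x categories count table
print("Fleiss' kappa:", fleiss_kappa(counts, method="fleiss"))

# Kendall's W from column-wise ranks (simple form, no tie correction).
ranks = np.apply_along_axis(rankdata, 0, ratings)  # rank patients per rater
rank_sums = ranks.sum(axis=1)
s = ((rank_sums - rank_sums.mean()) ** 2).sum()
w = 12 * s / (n_raters ** 2 * (n_patients ** 3 - n_patients))
print("Kendall's W:", w)
```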
Automated identification of advanced chronic kidney disease (CKD ≥ III) and of no known kidney disease (NKD) can support both clinicians and researchers. We hypothesized that identification of CKD and NKD can be improved by combining information from different electronic health record (EHR) resources, comprising laboratory values, discharge summaries, and ICD-10 billing codes, compared to using each component alone. We included EHRs from 785 elderly multimorbid patients hospitalized between 2010 and 2015, divided into a training and a test (n = 156) dataset. We used both the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUCPR), each with a 95% confidence interval, to evaluate different classification models. In the test dataset, the combination of EHR components as a simple classifier identified CKD ≥ III (AUROC 0.96 [0.93–0.98]) and NKD (AUROC 0.94 [0.91–0.97]) better than laboratory values (AUROC CKD 0.85 [0.79–0.90], NKD 0.91 [0.87–0.94]), discharge summaries (AUROC CKD 0.87 [0.82–0.92], NKD 0.84 [0.79–0.89]), or ICD-10 billing codes (AUROC CKD 0.85 [0.80–0.91], NKD 0.77 [0.72–0.83]) alone. Logistic regression and machine learning models improved recognition of CKD ≥ III compared to the simple classifier when only laboratory values were used (AUROC 0.96 [0.92–0.99] vs. 0.86 [0.81–0.91], p < 0.05) and improved recognition of NKD when information from previous hospital stays was used (AUROC 0.99 [0.98–1.00] vs. 0.95 [0.92–0.97], p < 0.05). Depending on the availability of data, correct automated identification of CKD ≥ III and NKD from EHRs can be improved by building classification models on the combination of different EHR components.
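As a minimal sketch (not the study's code) of evaluating a classifier with AUROC and AUCPR plus 95% confidence intervals, the following assumes a percentile bootstrap for the intervals (the abstract does not state the CI method) and uses simulated placeholder labels and scores:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(42)
n = 156  # size of the test dataset in the study
y_true = rng.integers(0, 2, size=n)  # 1 = CKD >= III (placeholder labels)
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.25, size=n), 0, 1)

def bootstrap_ci(metric, y, s, n_boot=2000, alpha=0.05):
    """Percentile bootstrap CI for a metric(y_true, y_score)."""
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))
        if len(np.unique(y[idx])) < 2:  # resample must contain both classes
            continue
        stats.append(metric(y[idx], s[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return metric(y, s), lo, hi

for name, metric in [("AUROC", roc_auc_score),
                     ("AUCPR", average_precision_score)]:
    point, lo, hi = bootstrap_ci(metric, y_true, y_score)
    print(f"{name}: {point:.2f} [{lo:.2f}-{hi:.2f}]")
```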