2017
DOI: 10.1101/123299
Preprint

CogStack - Experiences Of Deploying Integrated Information Retrieval And Extraction Services In A Large National Health Service Foundation Trust Hospital

Abstract: Background: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be def…

Cited by 8 publications (12 citation statements)
References 29 publications
“…Two clinicians (blinded to the ICD-10 and OPCS-4 codes recorded) reviewed the entire hospital record (charts, referral letters, discharge letters, imaging reports) for 283 patient hospital episodes from two large NHS Trusts (University College London Hospitals NHS Foundation Trust and King's College Hospital NHS Foundation Trust). The hospital record corpus (14,364,947 words in total) was made available as a single text file per patient through the use of CogStack (39), an enterprise-wide retrieval and extraction architecture for structured and unstructured information that integrates data across multiple EHR systems in a hospital. Patient consent for reviewing these records was provided through the NIHR-funded SIGNUM study of stroke patients.…”
Section: Methods
mentioning
confidence: 99%
“…Bleeding assignments from the clinicians' review were compared with those from the phenotyping algorithm, and we estimated the PPV, NPV, sensitivity and specificity using the case review data as the "gold standard". We extracted hospital data (14,364,947 words) using CogStack [57] from the consented Stroke InvestiGation Network-Understanding Mechanisms (SIGNUM) study.…”
Section: Cross-EHR Source Concordance
mentioning
confidence: 99%
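
The four measures named in the statement above follow directly from a 2x2 comparison of algorithm output against the clinician review. The sketch below is illustrative only and is not the citing authors' code: the names `ConfusionCounts`, `count_outcomes` and `validation_metrics` are hypothetical, and it assumes one boolean bleeding label per hospital episode from each source.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for a 2x2 comparison against the gold standard."""
    tp: int  # algorithm positive, clinician review positive
    fp: int  # algorithm positive, clinician review negative
    fn: int  # algorithm negative, clinician review positive
    tn: int  # algorithm negative, clinician review negative


def count_outcomes(predicted: Sequence[bool], gold: Sequence[bool]) -> ConfusionCounts:
    """Tally agreement between algorithm labels and gold-standard labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return the ratio, or None when the denominator is zero (metric undefined)."""
    return numerator / denominator if denominator else None


def validation_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Standard definitions of sensitivity, specificity, PPV and NPV."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }
```

Returning `None` for an undefined metric (for example, PPV when the algorithm flags no episodes) is a design choice in this sketch; a real analysis might instead report the metric as missing or attach a confidence interval.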
“…data retrieval, information extraction and semantic indexing. CogStack [14], a data harmonisation and enterprise search toolkit for EHRs, is adopted in the data retrieval step to provide a unified interface to unstructured EHR data, which is often very heterogeneous in format and distributed in storage. Each document that flows out from medical history, laboratory results); the continuous learning subsystem (to be described in next subsection) learns the contexts from user assessed annotations (see Supplementary Material 1 for detail).…”
Section: The Producing Subsystem
mentioning
confidence: 99%
“…To realise a general-purpose biomedical information extraction (IE) system on EHRs, there are at least three fundamental challenges: a) syntactic heterogeneity: how to effectively access multi-modal/multisource EHR data that are almost certainly heterogeneous in formats, data models and access interfaces; b) knowledge coverage: how to cover all possible biomedical concepts that are required by potential use cases; c) context capturing: how to represent and capture the contexts associated with extracted concepts, and which are critical to understanding the clinical domain. To address these challenges, SemEHR architects a production infrastructure that integrates our previous work in the CogStack pipeline [14] to harmonise and cleanse heterogeneous records, using them to identify contextualised mentions (negation, temporality and experiencer) of a wide range of biomedical concepts including SNOMED CT, ICD-10, LOINC and Drug Ontology. In addition, SemEHR automatically associates semantic types of annotations and their clinical contexts (derived from containing documents or sections) with dedicated extraction rules, which enables better IE capabilities such as populating the structured vital sign data from observation notes.…”
mentioning
confidence: 99%
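
To make the idea of a "contextualised mention" in the statement above concrete, the sketch below models a concept mention together with the three context attributes it names (negation, temporality and experiencer). This is an assumption-laden illustration, not the SemEHR or CogStack data model: every class, field and enum value here is hypothetical, and the concept identifier in the usage note is a placeholder rather than a real terminology code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Temporality(Enum):
    """Hypothetical temporality values; a real system may define others."""
    RECENT = "recent"
    HISTORICAL = "historical"


class Experiencer(Enum):
    """Hypothetical experiencer values: the patient or someone else (e.g. a relative)."""
    PATIENT = "patient"
    OTHER = "other"


@dataclass(frozen=True)
class Mention:
    """A concept mention plus the three context attributes named in the text."""
    concept_id: str        # placeholder identifier, not a real terminology code
    terminology: str       # e.g. "SNOMED CT", "ICD-10", "LOINC"
    text: str              # the matched span in the clinical document
    negated: bool
    temporality: Temporality
    experiencer: Experiencer


def is_affirmed_current_patient(m: Mention) -> bool:
    """True when the mention is not negated, is recent, and refers to the patient."""
    return (
        not m.negated
        and m.temporality is Temporality.RECENT
        and m.experiencer is Experiencer.PATIENT
    )


def select_mentions(mentions: Iterable[Mention]) -> List[Mention]:
    """Keep only mentions that assert a current finding in the patient."""
    return [m for m in mentions if is_affirmed_current_patient(m)]
```

For example, a mention built as `Mention("CONCEPT-001", "SNOMED CT", "no evidence of bleeding", True, Temporality.RECENT, Experiencer.PATIENT)` would be dropped by `select_mentions` because it is negated, which is the kind of distinction the context attributes exist to capture.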