2017
DOI: 10.1186/s12911-017-0437-1

Clinical records anonymisation and text extraction (CRATE): an open-source software system

Abstract: Background: Electronic medical records contain information of value for research, but contain identifiable and often highly sensitive confidential information. Patient-identifiable information cannot in general be shared outside clinical care teams without explicit consent, but anonymisation/de-identification allows research uses of clinical data without explicit consent. Results: This article presents CRATE (Clinical Records Anonymisation and Text Extraction), an open-source software system with separable function…

Citations: Cited by 41 publications (35 citation statements)
References: References 15 publications
“…Clinician-recorded diagnoses were present in coded data. Medicine information was extracted from free text using GATE-based natural language processing (NLP) software (15). The diseases we focused on included dementia (recorded with ICD-10 codes F00-F03 and G30, or indicated by mentions of cholinesterase inhibitors or glutamate receptor antagonists), substance misuse (F10-F19), severe (serious) mental illness (F20-F29, F30, and F31, or taking antipsychotics), depression (F32 or F33, or taking antidepressants), anxiety (F41 or F42), reaction to severe stress/adjustment disorders (F43), eating disorders (F50), personality disorders (F60-F69), intellectual disability (F70-F79), intentional self-harm (X60-X84), diabetes mellitus (E10-E14, or taking hypoglycemic agents), hypertension or cardiovascular or cerebrovascular disease (I10-I13, I15, I21-I25, and I60-I69, or taking ACE inhibitors, angiotensin-II receptor antagonists, beta blockers, calcium channel antagonists, or diuretics), dyslipidemia (E78, or taking lipid-lowering medications), asthma or chronic obstructive pulmonary disease (COPD) (J44, J45, or taking oral or inhaled corticosteroids, bronchodilators, or antiinflammatory drugs used for airways disease, accepting that oral corticosteroids may also indicate other inflammatory disorders), and cancer (C00-C97, or taking drugs specifically licensed for cancer).…”
Section: Variables (mentioning)
confidence: 99%
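
The diagnostic groupings in the statement quoted above amount to a lookup from ICD-10 code ranges to disease categories. The following is a minimal Python sketch of that lookup, not the citing authors' implementation. It assumes codes arrive as WHO ICD-10 strings such as "F32.1"; the names `DISEASE_RANGES` and `categories_for` are hypothetical. The medication-based criteria (for example "or taking antidepressants") are deliberately omitted, since they depend on NLP output from free text rather than on coded diagnoses.

```python
import re

# Inclusive (letter, first, last) ranges of three-character ICD-10 categories,
# transcribed from the quoted passage. Category names are illustrative labels.
DISEASE_RANGES = {
    "dementia": [("F", 0, 3), ("G", 30, 30)],
    "substance misuse": [("F", 10, 19)],
    "severe mental illness": [("F", 20, 29), ("F", 30, 31)],
    "depression": [("F", 32, 33)],
    "anxiety": [("F", 41, 42)],
    "reaction to severe stress/adjustment disorders": [("F", 43, 43)],
    "eating disorders": [("F", 50, 50)],
    "personality disorders": [("F", 60, 69)],
    "intellectual disability": [("F", 70, 79)],
    "intentional self-harm": [("X", 60, 84)],
    "diabetes mellitus": [("E", 10, 14)],
    "hypertension or cardiovascular or cerebrovascular disease": [
        ("I", 10, 13), ("I", 15, 15), ("I", 21, 25), ("I", 60, 69),
    ],
    "dyslipidemia": [("E", 78, 78)],
    "asthma or COPD": [("J", 44, 45)],
    "cancer": [("C", 0, 97)],
}

# A letter followed by two digits identifies the three-character category;
# anything after that (e.g. ".1") is ignored.
_CODE_RE = re.compile(r"^([A-Z])(\d{2})")


def categories_for(icd10_code):
    """Return the disease categories whose code ranges contain icd10_code."""
    match = _CODE_RE.match(icd10_code.strip().upper())
    if match is None:
        return []  # not recognisable as an ICD-10 category
    letter, number = match.group(1), int(match.group(2))
    return [
        name
        for name, ranges in DISEASE_RANGES.items()
        if any(letter == l and first <= number <= last for l, first, last in ranges)
    ]


if __name__ == "__main__":
    for code in ("F32.1", "G30", "I14", "C50.9"):
        print(code, categories_for(code))
```

Storing the ranges as data rather than as chained conditionals keeps the mapping easy to audit against the published definitions, and a code that falls outside every range (such as I14) simply yields an empty list.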
“…Privacy protection is a critical issue in clinical data sharing for both research and clinical practices, and privacy violations often incur legal problems with substantial consequences. The privacy component embedded in the infrastructure offers technical solutions to deidentify or anonymize patient-level data, such as CRATE [26] and DEDUCE [27]. CRATE is an open-source software system that anonymizes an electronic health records database to create a research database with anonymized patients' data.…”
Section: Results (mentioning)
confidence: 99%
“…First of all, practical implementation requires a more thorough privacy and security component. The privacy and security component is part of the proposed architecture and currently implemented in the prototype using CRATE [26] and HTTPS. However, since only anonymized datasets are used in the evaluation, the deidentification toolkit, CRATE, was not validated.…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition, there are specific psychiatric databases being developed for researchers with some attention paid to the sensitivity of such data and the development of highly sophisticated systems to protect anonymity while also allowing researcher access. 13 There is however no consistency in the kinds of psychiatric databases which currently exist which is arguably ethically and practically problematic. In a 'snapshot' overview of mental health databases available globally, there were no definable boundaries in BD sets in mental health with some generic health databases containing mental health information and with others more specialised and related solely to mental health conditions.…”
Section: BD and Mental Health Research (mentioning)
confidence: 99%