2017
DOI: 10.1016/j.jbi.2017.05.001

The UAB Informatics Institute and 2016 CEGS N-GRID de-identification shared task challenge

Abstract: Clinical narratives (the text notes found in patients’ medical records) are important information sources for secondary use in research. However, in order to protect patient privacy, they must be de-identified prior to use. Manual de-identification is considered to be the gold standard approach but is tedious, expensive, slow, and impractical for use with large-scale clinical data. Automated or semi-automated de-identification using computer algorithms is a potentially promising alternative. The Informatics In…

Cited by 14 publications (15 citation statements)
References 24 publications
“…De-identification system Machine learning S1 (Zhao, Zhang, Ma, and Li (2018)), S2 (Chen, Cullen, and Godwin (2015)) S3 (Dernoncourt, Lee, Uzuner, and Szolovits (2017)), S4 (Yadav, Ekbal, Saha, Pathak, and Bhattacharyya (2017)), S5 ), S6 ) Hybrid S7 (Yang and Garibaldi (2015)) S8 (Liu, Tang, Wang, and Chen (2017)) S9 (Lee, Dernoncourt, Uzuner, and Szolovits (2016)) S10 (Dehghan, Kovacevic, Karystianis, Keane, and Nenadic (2015)) S11 (Yang and Garibaldi (2015)) S12 (He, Guan, Cheng, Cen, and Hua (2015)) S13 (Liu, Chen, Tang, Wang, Chen, Li, Wang, Deng, and Zhu (2015)) S14 (Phuong and Chau (2016)) S15 (Bui, Wyatt, and Cimino (2017a)) S16 (Jiang, Zhao, He, Guan, and Jiang (2017)) S17 (Lee, Wu, Zhang, Xu, Xu, and Roberts (2017)) S18 (Shweta, Kumar, Ekbal, Saha, and Bhattacharyya (2016)) In this section, we outline the most significant achievement of automating end-to-end de-identification system: improving accuracy. It has been argued that as far as de-identification is concerned, perfection cannot be achieved; however, 95% accuracy is considered to be the rule of thumb and universally accepted value.…”
Section: Architecture (mentioning)
confidence: 99%
“…Unlike other medical data, such as that of the 2014 challenge, psychiatric data contains an abundance of information related to the patients such as places lived, jobs held, children's ages, hobbies, traumatic events, patients' relatives' relationship information, and pet names. This makes it a much more significant challenge to deidentify (Bui, Wyatt, and Cimino (2017b)).…”
Section: Overview Of Datasets (mentioning)
confidence: 99%
“…In the 2016 i2b2 shared task, ensemble with rule-based models became more popular. Lee et al [12], Dehghan et al [13], Bui et al [14], and Liu et al [15] all employed rule-based models as a component of their hybrid systems. However, despite the wide use of rules, all the works did not investigate the effect of rule-based models in hybrid architecture.…”
Section: Prior Work (mentioning)
confidence: 99%
“…De-identification of EHR can be done manually or automatically. Manual deidentification is time consuming, tedious, costly, can only be done by a restricted set of individuals allowed access to the original patient notes, and is subject to human error (Bui, Wyatt, and Cimino (2017); Dehghan, Kovacevic, Karystianis, Keane, and Nenadic (2015); Dernoncourt, Lee, Uzuner, and Szolovits (2017); He, Guan, Cheng, Cen, and Hua (2015)). There have been many attempts to develop an automatic system that will de-identify EHR with certainty.…”
Section: Introduction (mentioning)
confidence: 99%