2017
DOI: 10.1016/j.jbi.2017.05.023

De-identification of clinical notes via recurrent neural network and conditional random field

Abstract: De-identification, the identification and removal of identifying information such as protected health information (PHI) from clinical data, is a critical step in enabling data to be shared or published. The 2016 Centers of Excellence in Genomic Science (CEGS) Neuropsychiatric Genome-scale and RDOC Individualized Domains (N-GRID) clinical natural language processing (NLP) challenge includes a track on de-identifying electronic medical records (EMRs) (track 1). The challenge organizers provide 1000 annotate…

Citations: cited by 124 publications (126 citation statements)
References: 39 publications
“…De-identification system. Machine learning: S1 (Zhao, Zhang, Ma, and Li (2018)), S2 (Chen, Cullen, and Godwin (2015)), S3 (Dernoncourt, Lee, Uzuner, and Szolovits (2017)), S4 (Yadav, Ekbal, Saha, Pathak, and Bhattacharyya (2017)), S5, S6. Hybrid: S7 (Yang and Garibaldi (2015)), S8 (Liu, Tang, Wang, and Chen (2017)), S9 (Lee, Dernoncourt, Uzuner, and Szolovits (2016)), S10 (Dehghan, Kovacevic, Karystianis, Keane, and Nenadic (2015)), S11 (Yang and Garibaldi (2015)), S12 (He, Guan, Cheng, Cen, and Hua (2015)), S13 (Liu, Chen, Tang, Wang, Chen, Li, Wang, Deng, and Zhu (2015)), S14 (Phuong and Chau (2016)), S15 (Bui, Wyatt, and Cimino (2017a)), S16 (Jiang, Zhao, He, Guan, and Jiang (2017)), S17 (Lee, Wu, Zhang, Xu, Xu, and Roberts (2017)), S18 (Shweta, Kumar, Ekbal, Saha, and Bhattacharyya (2016)). In this section, we outline the most significant achievement of automating end-to-end de-identification system: improving accuracy. It has been argued that as far as de-identification is concerned, perfection cannot be achieved; however, 95% accuracy is considered to be the rule of thumb and universally accepted value.…”
Section: Architecture (mentioning)
confidence: 99%
“…The M I2B2 model is tuned to achieve state-of-the-art results on textual medical notes, such as in Dernoncourt et al. (2017a); Liu et al. (2017). It should be stressed that the model was used as is, without an attempt to adapt it to the domain of ASR output.…”
Section: Pipeline Models (mentioning)
confidence: 99%
“…Deep Learning (word2vec) [299]; data types: Research Articles [299].
Depression. Techniques: DT [303], kNN [134,298], NN [295], Regression [294,296], RF [134], SVM [134], Linear Discriminant Analysis [134]; data types: Survey [296,303,304], Social Media [298], Electronic Health Records [295], Imaging [134,294], Biological [134,296].
Healthy Ageing. Techniques: RF [304]; data types: Survey [304].
Psychosis. Techniques: SVM, Multiple Kernel Learning [297]; data types: Imaging [297].
Schizophrenia. Techniques: RF [291], SVM [291,293], Linear Discriminant Analysis [291], kNN [291]; data types: Insurance [291], Imaging [293].
Substance Use. Techniques: Topic modelling [306]; data types: Interview [306].
Symptom Severity. Techniques: NN [301]; data types: Clinical Notes [301].
Wellbeing. Techniques: BN [302], SVM [302], Deep Learning (paragraph2vec) [300], NN [307]; data types: Clinical Notes [300,302].
As an emerging field, there are understandably significant gaps for future research to address. It is evident that the majority of papers focus on diagnosis and detection, particularly on depression, suicide risk and cognitive decline.…”
Section: Technique(s) Data Type (mentioning)
confidence: 99%