2021
DOI: 10.32473/flairs.v34i1.128480

De-identification of Emergency Medical Records in French: Survey and Comparison of State-of-the-Art Automated Systems

Abstract: In France, structured data from emergency room (ER) visits are aggregated at the national level to build a syndromic surveillance system for several health events. For visits motivated by a traumatic event, information on the causes is stored in free-text clinical notes. To exploit these data, an automated de-identification system guaranteeing protection of privacy is required. In this study we review available de-identification tools to de-identify free-text clinical documents in French. A key point is how to…

Cited by 5 publications (13 citation statements)
References 13 publications

“…In this work, we address this challenge by processing more than 58 document types. Furthermore, while previous work on French clinical deidentification annotates their corpus manually ([4], [9]), our approach uses distant supervision, which reduces both the cost and time required for annotation.…”
Section: Discussion (mentioning)
confidence: 99%
“…To reuse such records and conduct health data-related studies, the task of deidentification has become essential ([4], [5], [6]). This is necessary to protect the confidentiality of personal data in EHRs and comply with government regulations set in our case by the French Data Protection Authority, Commission Nationale de l'Informatique et des Libertés (CNIL), and the General Data Protection Regulation (GDPR).…”
Section: Introduction (mentioning)
confidence: 99%
“…Although our own dataset contains more than 3,600 annotated documents, our experiments with varying the size of the training set led us to the conclusion that excellent performance is achieved as early as 500 annotated documents, and that performance stops increasing significantly beyond 1,000 documents (►Fig. 6).…”
Section: Size Of the Training Dataset (mentioning)
confidence: 99%
“…A lot of work has been done on this topic, in several languages [1][2][3], including French [4][5][6][7]. Different scenarios have been proposed to improve the processing of this task [4,8]. Yet, there is no consensus method or protocol in the community, and more importantly it is very difficult for new actors to benefit from the experience and tools implemented by others, for several reasons.…”
Section: Introduction (mentioning)
confidence: 99%
“…To reuse such records and conduct health data-related studies, the task of de-identification has become essential [4][5][6]. This is necessary to protect the confidentiality of personal data in EHRs and comply with government regulations set in our case by the French Data Protection Authority, Commission Nationale de l'Informatique et des Libertés (CNIL), and the General Data Protection Regulation (GDPR).…”
Section: Introduction (mentioning)
confidence: 99%