2009
DOI: 10.1109/msp.2009.183

The Rules of Redaction: Identify, Protect, Review (and Repeat)

Cited by 29 publications (37 citation statements)
References 8 publications
“…); and ii) mask these terms to minimize disclosure by means of an appropriate protection mechanism (e.g., removal, generalization, etc.). The community refers to the act of removing or blacking-out sensitive terms as redaction, whereas sanitization usually consists in coarsening them via generalization (e.g., AIDS can be replaced by a less detailed generalization such as disease) [3]. The latter approach, which we use in this paper, better preserves the utility of the output.…”
Section: Background On Plain Textual Data Protection (mentioning)
confidence: 99%
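The distinction drawn in the statement above, removal (redaction) versus coarsening by generalization (sanitization), can be illustrated with a minimal sketch. It is not the method of the cited or citing papers: the function names `redact` and `sanitize`, the `[REDACTED]` mask, and the dictionary-based lookup are all assumptions made for illustration. Only the AIDS to disease pair comes from the quoted text.

```python
import re

# Hypothetical generalization map: sensitive term -> less detailed category.
# Only the AIDS -> disease pair is taken from the quoted statement.
GENERALIZATIONS = {"AIDS": "disease"}


def _pattern(terms):
    # Match whole words only, trying longer terms first.
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b")


def redact(text, terms, mask="[REDACTED]"):
    """Redaction: black out each sensitive term."""
    if not terms:
        return text
    return _pattern(terms).sub(mask, text)


def sanitize(text, generalizations):
    """Sanitization: replace each sensitive term with a coarser generalization."""
    if not generalizations:
        return text
    return _pattern(generalizations).sub(
        lambda m: generalizations[m.group(1)], text
    )


if __name__ == "__main__":
    sample = "Diagnosis: AIDS."
    print(redact(sample, GENERALIZATIONS))    # Diagnosis: [REDACTED].
    print(sanitize(sample, GENERALIZATIONS))  # Diagnosis: disease.
```

The sanitized output still tells the reader that a diagnosis of some disease was recorded, which is the utility argument the statement makes for generalization over removal. The sketch only handles terms listed in advance. It does not detect semantically related terms.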
“…In particular, all references to Sexually Transmitted Diseases (STDs) or HIV status should be redacted or sanitized. To do so, terms explicitly referring to these diseases and those semantically related ones such as drugs, treatments or symptoms should be identified and protected [3].…”
Section: Empirical Analysis (mentioning)
confidence: 99%
“…Regarding the former, U.S. federal laws on medical data privacy [35,36] mandate hospitals and healthcare organizations to protect any references made to STDs and HIV status in patient medical records before releasing them to, for example, insurance companies in response to Worker's Compensation or Motor Vehicle Accident claims. To do so, those terms explicitly referring to these diseases and those semantically related ones, such as treatments or symptoms, should be protected [19]. Likewise, the EU Data Protection Directive [38] states that the information related to the religion and sexual orientation of EU citizens should be protected in order to avoid possible discrimination.…”
Section: Evaluation Data and Case Studies (mentioning)
confidence: 99%
“…Its goal is to mimic and, hence, automatize the reasoning of human sanitizers with regard to semantic inferences, disclosure analysis, and protection of textual documents. To achieve that, our proposal relies on an assessment and quantification of the data semantics that human experts usually consider in document sanitization (Bier et al., ; Gordon, ). Our proposal provides the following contributions over the state of the art: In comparison with available models (Anandan et al., ; Cumby & Ghani, ), which assume that all risky terms (sensitive entities or related terms) have been identified a priori, our proposal automatizes both the detection of terms that can disclose sensitive data via semantic inferences and their protection.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our proposal provides the following contributions over the state of the art: In comparison with available models (Anandan et al., ; Cumby & Ghani, ), which assume that all risky terms (sensitive entities or related terms) have been identified a priori, our proposal automatizes both the detection of terms that can disclose sensitive data via semantic inferences and their protection. This relieves human sanitizers from manually identifying related terms, which has been identified as one of the most difficult and time‐consuming challenges (Bier et al., ; Gordon, ). To do so, our model considers, as human sanitizers do, the semantic relationships by which terms or combinations of terms appearing in a document would disclose sensitive information via semantic inferences.…”
Section: Introduction (mentioning)
confidence: 99%