2023
DOI: 10.56553/popets-2023-0069

Story Beyond the Eye: Glyph Positions Break PDF Text Redaction

Abstract: In this work we find that many current redactions of PDF text are insecure due to non-redacted character positioning information. In particular, subpixel-sized horizontal shifts in redacted and non-redacted characters can be recovered and used to effectively deredact first and last names. Unfortunately, these findings affect redactions where the text underneath the black box is removed from the PDF. We demonstrate these findings by performing a comprehensive vulnerability assessment of common PDF redaction typ…

Cited by 3 publications (1 citation statement)
References 6 publications
“…First, it seems that redaction is a weak way to protect secrets. Redaction techniques are, in general, susceptible to textual analysis or even to simple copy-paste removal of superimposed blanking (Bland et al., 2023; Ingram, 2019). Redactions applied to historical cryptological papers are vulnerable to techniques which would be recognised by codebreakers of the period from which they originate: contextual analysis allowing linguistic interpolation; parallel availability of the same text in a different communication; word length analysis; and fingerprinting.…”
Section: Discussion
Citation type: mentioning
confidence: 99%