2022
DOI: 10.48550/arxiv.2205.08890
Preprint

Analysing and strengthening OpenWPM's reliability

Abstract: Automated browsers are widely used to study the web at scale. Their premise is that they measure what regular browsers would encounter on the web. In practice, deviations due to detection of automation have been found. To what extent automated browsers can be improved to reduce such deviations has so far not been investigated in detail. In this paper, we investigate this for a specific web automation framework: OpenWPM, a popular research framework specifically designed to study web privacy. We analyse (1) det…

Cited by 1 publication
(2 citation statements)
References 22 publications
“…Several studies, including Krumnow and Gossen et al [34,43] have improved OpenWPM's capabilities for crawling sites. However, if a site has a tracking script, it may detect the crawler and hinder its normal detection abilities, particularly for phishing sites.…”
Section: Page Cloaking (mentioning)
confidence: 99%
“…Krummov et al [43] analyzed the reliability of OpenWPM and found that scripts detecting the presence of display and wrapper functions can detect Bot Detection on OpenWPM, and at least 16.7% of sites on Tranco Top 100k recognized Selenium and OpenWPM. Vastel et al [67] suggested that crawler detection can use fingerprinting.…”
Section: Bot Detection (mentioning)
confidence: 99%