Dark patterns are user interface design choices that benefit an online service by coercing, steering, or deceiving users into making unintended and potentially harmful decisions. We present automated techniques that enable experts to identify dark patterns on a large set of websites. Using these techniques, we study shopping websites, which often use dark patterns to influence users into making more purchases or disclosing more information than they would otherwise. Analyzing ∼53K product pages from ∼11K shopping websites, we discover 1,818 dark pattern instances, together representing 15 types and 7 broader categories. We examine these dark patterns for deceptive practices, and find 183 websites that engage in such practices. We also uncover 22 third-party entities that offer dark patterns as a turnkey solution. Finally, we develop a taxonomy of dark pattern characteristics that describes the underlying influence of the dark patterns and their potential harm on user decision-making. Based on our findings, we make recommendations for stakeholders including researchers and regulators to study, mitigate, and minimize the use of these patterns.
Automated analysis of privacy policies has proved a fruitful research direction, with developments such as automated policy summarization, question answering systems, and compliance detection. Prior research has been limited to analysis of privacy policies from a single point in time or from short spans of time, as researchers did not have access to a large-scale, longitudinal, curated dataset. To address this gap, we developed a crawler that discovers, downloads, and extracts archived privacy policies from the Internet Archive's Wayback Machine. Using the crawler and following a series of validation and quality control steps, we curated a dataset of 1,071,488 English language privacy policies, spanning over two decades and over 130,000 distinct websites. Our analyses of the data paint a troubling picture of the transparency and accessibility of privacy policies. By comparing the occurrence of tracking-related terminology in our dataset to prior web privacy measurements, we find that privacy policies have consistently failed to disclose the presence of common tracking technologies and third parties. We also find that over the last twenty years privacy policies have become even more difficult to read, doubling in length and increasing a full grade in the median reading level. Our data indicate that self-regulation for first-party websites has stagnated, while self-regulation for third parties has increased but is dominated by online advertising trade associations. Finally, we contribute to the literature on privacy regulation by demonstrating the historic impact of the GDPR on privacy policies.