Information Disguise (ID), a part of computational ethics in Natural Language Processing (NLP), is concerned with best practices of textual paraphrasing to prevent the non-consensual use of authors' posts on the Internet. Research on ID is especially important when authors' online writing pertains to sensitive domains, e.g., mental health. Over time, researchers have used AI-based automated word spinners (e.g., SpinRewriter, WordAI) to paraphrase content. However, these tools fail to serve the purpose of ID because their paraphrased content still leads back to the source when queried on search engines. There is limited prior work on judging the effectiveness of paraphrasing methods for ID on search engines or on their proxies, neural retriever (NeurIR) models. We propose a framework in which, for a given sentence from an author's post, we iteratively perturb the sentence in the direction of paraphrasing, attempting to confuse the search mechanism of a NeurIR system when the sentence is queried on it. Our experiments use the subreddit "r/AmItheAsshole" as the source of public content and Dense Passage Retriever as a NeurIR-based proxy for search engines. Our work introduces a novel method of ranking phrase importance using perplexity scores and performs multi-level phrase substitutions via beam search. Our multi-phrase substitution scheme succeeds in disguising sentences 82% of the time and hence takes an essential step towards enabling researchers to disguise sensitive content effectively before making it public. We also release the code of our approach.
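The general idea of searching over phrase substitutions until the source no longer ranks first can be illustrated with a minimal sketch. This is not the authors' implementation: the word-overlap scorer is a toy stand-in for Dense Passage Retriever, the substitution table is hand-written and hypothetical, and the perplexity-based phrase-importance ranking is omitted entirely. All function names and parameters (`overlap_score`, `source_rank`, `disguise`, `beam_width`, `target_rank`) are assumptions made for illustration.

```python
def overlap_score(query, doc):
    """Toy stand-in for a neural retriever: Jaccard overlap of word sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if (q | d) else 0.0


def source_rank(query, corpus, source_idx):
    """1-based rank of the source document when the corpus is sorted by score."""
    scores = [overlap_score(query, doc) for doc in corpus]
    order = sorted(range(len(corpus)), key=lambda i: (-scores[i], i))
    return order.index(source_idx) + 1


def disguise(sentence, corpus, source_idx, substitutions,
             beam_width=3, max_steps=3, target_rank=2):
    """Beam search over phrase substitutions (hypothetical table).

    Returns the first candidate for which the source document falls to
    `target_rank` or lower in the toy ranking, or None if none is found.
    """
    beam, seen = [sentence], {sentence}
    for _ in range(max_steps):
        candidates = []
        for cand in beam:
            for phrase, alternatives in substitutions.items():
                if phrase not in cand:
                    continue
                for alt in alternatives:
                    new = cand.replace(phrase, alt)
                    if new not in seen:
                        seen.add(new)
                        candidates.append(new)
        if not candidates:
            break
        # Prefer candidates that push the source further down the ranking,
        # breaking ties by lower similarity to the source.
        candidates.sort(key=lambda s: (-source_rank(s, corpus, source_idx),
                                       overlap_score(s, corpus[source_idx])))
        beam = candidates[:beam_width]
        if source_rank(beam[0], corpus, source_idx) >= target_rank:
            return beam[0]
    return None


# Illustrative data only; not drawn from the paper's dataset.
corpus = [
    "i yelled at my roommate for eating my leftovers",
    "my neighbor was shouting at a housemate about food",
    "the weather is nice today",
]
substitutions = {
    "yelled at": ["was shouting at"],
    "roommate": ["housemate"],
    "eating my leftovers": ["about food"],
}
print(disguise(corpus[0], corpus, 0, substitutions))
```

In this sketch a single substitution is not enough to dislodge the source from the top rank, so the search has to combine several substitutions, which is the motivation for multi-phrase substitution described above. A real system would replace the overlap scorer with an actual retriever and generate substitutions automatically.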
In the last two decades, Information and Communication Technologies (ICTs) have played a pivotal role in empowering rural populations in India by making knowledge more accessible. Digital Green is one such ICT: it takes a participatory approach with smallholder farmers to produce instructional agricultural videos with content specific to them. With the help of human mediators, Digital Green screens these videos for farmers using projectors to improve the adoption of agricultural practices. Digital Green's web-based data tracker (CoCo) stores the attendance and adoption logs of millions of farmers, the videos screened for them, and their demographic information. In our work, we leverage ten years of this data, from 2010 to 2020, across the five states in India where Digital Green is most active, and use it to conduct a holistic evaluation of the ICT. First, we find disparities in the adoption rates of farmers; we then use statistical tests to identify the factors that lead to these disparities, as well as gender-based inequalities. We find that farmers with higher adoption rates adopt videos of shorter duration and belong to smaller