Proceedings of the 5th Workshop on Noisy User-Generated Text (W-NUT 2019)
DOI: 10.18653/v1/d19-5507

Character-Based Models for Adversarial Phone Extraction: Preventing Human Sex Trafficking

Abstract: Illicit activity on the Web often uses noisy text to obscure information between client and seller, such as the seller's phone number. This presents an interesting challenge to language understanding systems; how do we model adversarial noise in a text extraction system? This paper addresses the sex trafficking domain, and proposes some of the first neural network architectures to learn and extract phone numbers from noisy text. We create a new adversarial advertisement dataset, propose several RNN-based model…

Cited by 4 publications (7 citation statements)
References 17 publications
“…We use the term adult service providers (ASP) to refer to a worker in the sex industry (also referred to as the sex trade) that offers sexual services. Given that an ASP is seeking paying clients and that clients must have a method by which to contact them, linking ads which use the same phone number is logical and has proved to be an effective strategy (Chambers et al 2019, Dubrawski et al 2015, Ibanez and Gazan 2016a). Over time, phone numbers have become inexpensive, that is, easy to obtain and change, making them a less reliable proxy for an individual.…”
Section: Human Trafficking and Online Sex Advertisements (mentioning)
confidence: 99%
“…Given that an ASP is seeking paying clients and that clients must have a method by which to contact them, linking ads which use the same phone number is logical and has proved to be an effective strategy (Chambers et al. 2019, Dubrawski et al. 2015, Ibanez and Gazan 2016a).…”
Section: Introduction (mentioning)
confidence: 99%
“…However, these indicators can only be studied within a cluster of ads linked with individual vendor accounts. Previous work by Chambers et al (2019) proposed using neural networks to extract phone numbers to connect these escort ads. Nonetheless, our research demonstrates that only 37% of the ads in our dataset contained phone numbers.…”
Section: Related Research (mentioning)
confidence: 99%
“…Figure 2(A) and figure 2(B) show that most ads in our dataset (approximately 99%) have a sentence length below 512 tokens and 2,000 characters. To generate ground truth, i.e., vendor labels, we employ the TJBatchExtractor Nagpal et al (2017b) and CNN-LSTM-CRF classifier Chambers et al (2019) to extract phone numbers from the ads. Subsequently, we utilize NetworkX Hagberg et al (2008) to create vendor communities based on these phone numbers.…”
Section: Dataset (mentioning)
confidence: 99%
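
The linking step described in the statement above (grouping ads into vendor communities when they share a phone number) can be sketched as follows. This is a minimal illustration, not the citing authors' pipeline: they report using NetworkX, while this sketch swaps in a plain union-find so that it has no dependencies. The function name `link_ads_by_phone` and the input layout (ad id mapped to a set of already extracted phone strings) are assumptions made for the example.

```python
from collections import defaultdict


def link_ads_by_phone(ad_phones):
    """Cluster ad ids so that ads sharing any phone number end up together.

    ad_phones: dict mapping an ad id to a set of phone-number strings
    (hypothetical layout; extraction is assumed to have happened already).
    Returns a sorted list of sorted clusters.
    """
    # Union-find parent table: every ad starts as its own cluster.
    parent = {ad: ad for ad in ad_phones}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first ad seen for each phone and merge later ads into it.
    first_ad_for_phone = {}
    for ad, phones in ad_phones.items():
        for phone in phones:
            if phone in first_ad_for_phone:
                union(first_ad_for_phone[phone], ad)
            else:
                first_ad_for_phone[phone] = ad

    clusters = defaultdict(set)
    for ad in ad_phones:
        clusters[find(ad)].add(ad)
    return sorted(sorted(members) for members in clusters.values())
```

Ads without any extracted phone number stay in singleton clusters, which matches the limitation noted in the Related Research statement: an ad with no phone number cannot be linked this way.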
“…• TJBatch - The state-of-the-art named entity extractor (Dubrawski et al, 2015; Chambers et al, 2019) in the human trafficking domain. This method extracts words from a dictionary and is based on manually designed regex rules.…”
Section: Introduction (mentioning)
confidence: 99%
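
To make the idea of a dictionary-plus-regex extractor concrete, here is a small rule-based sketch. It is not TJBatch and not the neural models proposed in the paper; the dictionary `WORD_TO_DIGIT`, the function name `extract_phone_candidates`, and the 10-digit target length are all assumptions chosen for illustration. It only handles spelled-out digits and separators, which is the kind of narrow, hand-written rule that adversarial noise can easily evade.

```python
import re

# Hypothetical dictionary of spelled-out digits (lowercase English only).
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

# A token is either a run of letters or a single ASCII digit;
# whitespace and punctuation between tokens are skipped.
_TOKEN = re.compile(r"[a-z]+|[0-9]")


def extract_phone_candidates(text, length=10):
    """Return digit strings of exactly `length` digits found in `text`.

    Digits and digit words accumulate into a run; any other word ends
    the run. Runs with a different number of digits are discarded.
    """
    candidates = []
    run = []

    def flush():
        if len(run) == length:
            candidates.append("".join(run))
        run.clear()

    for token in _TOKEN.findall(text.lower()):
        if token in WORD_TO_DIGIT:
            run.append(WORD_TO_DIGIT[token])
        elif token.isdigit():
            run.append(token)
        else:
            flush()  # an unrelated word breaks the current digit run
    flush()
    return candidates
```

For example, `extract_phone_candidates("call me at four 1 five - 555 . one two 3 four")` yields one candidate, `"4155551234"`. Obfuscations outside the dictionary (look-alike characters, digits embedded in words) are missed, which is the gap that learned character-based models aim to close.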