2018
DOI: 10.1145/3274366

Combining Crowd and Machines for Multi-predicate Item Screening

Abstract: This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can identify how to manag…
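To make the hybrid screening idea from the abstract concrete, the following is a minimal Python sketch, not the paper's actual algorithm: it assumes a per-predicate machine score with a confidence threshold and a crowd fallback in the uncertain region, and it stops work on an item as soon as one predicate fails. The names machine_prob, crowd_vote, and CONFIDENCE_THRESHOLD are hypothetical placeholders, not from the paper.

import random

CONFIDENCE_THRESHOLD = 0.9  # assumed cut-off for trusting the machine classifier alone

def machine_prob(item, predicate):
    # Placeholder: in practice, a trained classifier's probability that the item satisfies the predicate.
    return random.random()

def crowd_vote(item, predicate):
    # Placeholder: in practice, an aggregate of several paid crowd judgments.
    return random.random() > 0.5

def satisfies(item, predicate):
    # Decide one predicate, paying for crowd labels only when the machine is unsure.
    p = machine_prob(item, predicate)
    if p >= CONFIDENCE_THRESHOLD:
        return True
    if p <= 1 - CONFIDENCE_THRESHOLD:
        return False
    return crowd_vote(item, predicate)

def screen(items, predicates):
    # Keep only items that satisfy every predicate; an item is excluded at the first failed predicate.
    return [item for item in items if all(satisfies(item, pred) for pred in predicates)]

The short-circuit in screen reflects the cost argument: once any exclusion predicate fails, no further machine or crowd effort is spent on that item.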

Cited by 17 publications (11 citation statements)
References 28 publications
“…Where multiple elements of the articles need to be assessed, machine learning can require considerable costs related to training. Although a comparison of crowdsourcing with text-mining performance is valid, it is also worth considering that combining machine learning and crowdsourcing may lead to the greatest workload reduction for the crowd and investigative teams [43,44]. This hybrid approach has been researched and applied in a variety of fields outside the SR field.…”
Section: Discussion (mentioning)
Confidence: 99%
“…In this section, we examine the behavior of AL approaches in crowdsourcing settings. Specifically, we focus on problems where we start from a blank slate, have a pool of items to classify and a crowd at our disposal, and need not only to choose/assess AL approaches but also to assess if the crowd is leveraged only to get labeled data for training or also to perform classification at inference time, as done in hybrid classification contexts (Krivosheev et al 2018a; Callaghan et al 2018).…”
Section: Experimental Work (mentioning)
Confidence: 99%
“…The Amazon Sentiment-1 dataset (Krivosheev et al 2018a) includes annotations about deciding whether the given product review belongs to a book or not. Similarly, the Amazon Sentiment-2 dataset (Krivosheev et al 2018a) includes annotations about whether the given product review has a negative or positive sentiment. The Crisis-1 dataset (Imran et al 2013) consists of human-labeled tweets collected during the 2012 Hurricane Sandy and the 2011 Joplin tornado.…”
Section: Datasets (mentioning)
Confidence: 99%
“…Text classification, in particular, is a recurrent goal of machine learning (ML) projects, and a typical task in crowdsourcing platforms. Hybrid approaches, combining ML and crowd efforts, have been proposed to boost accuracy and reduce costs [2–4]. One possibility is to use automatic techniques for highlighting relevant excerpts in the text and then ask workers to classify.…”
Section: Objective (mentioning)
Confidence: 99%
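To illustrate the highlight-then-classify idea described in the last statement, here is a minimal sketch assuming a simple keyword-based highlighter; the keyword list and the ask_crowd_to_classify stub are hypothetical and not taken from the cited work.

import re

PREDICATE_KEYWORDS = ["randomized", "controlled trial", "placebo"]  # assumed example keywords

def highlight_excerpts(text, keywords=PREDICATE_KEYWORDS, window=60):
    # Return short snippets around keyword matches to show to crowd workers.
    snippets = []
    for kw in keywords:
        for m in re.finditer(re.escape(kw), text, flags=re.IGNORECASE):
            start = max(0, m.start() - window)
            end = min(len(text), m.end() + window)
            snippets.append(text[start:end])
    return snippets

def ask_crowd_to_classify(item_text):
    # Placeholder: in a real pipeline the excerpts would be posted to a crowdsourcing
    # platform together with the classification question; here we only return them.
    return highlight_excerpts(item_text)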