This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently, and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can manage the cost-accuracy trade-off by progressively deciding whether to spend budget on obtaining test data (to assess the accuracy of the given classifiers), on training an ensemble of classifiers, or on leveraging the existing machine classifiers together with the crowd; in the latter case, we show how to efficiently combine them based on their estimated characteristics to obtain the classification. We demonstrate that the proposed techniques achieve significant cost/accuracy improvements with respect to leading classification algorithms.
ACM Reference Format: Evgeny Krivosheev, Fabio Casati, Marcos Baez, and Boualem Benatallah. 2019. Combining Crowd and Machines for Multi-predicate Item Screening. 1, 1 (April 2019), 18 pages. https://doi.org/0000001.0000001
BACKGROUND AND MOTIVATION

A frequently occurring classification problem consists in identifying items that pass a set of screening tests (filters). This is common not only in medical diagnosis but in many other fields as well: from database querying, where we filter tuples based on predicates [Parameswaran et al. 2014], to hotel search, where we filter places based on features of interest [Lan et al. 2017], to systematic literature reviews (SLR), where we screen candidate papers based on a set of exclusion criteria to assess whether they are in scope for the review [Wallace et al. 2017].

The goal of this paper is to understand how, given a set of trained classifiers whose accuracy may or may not be known for the problem at hand (for a specific query predicate, hotel feature, or paper topic), we can combine machine learning (ML) and human (H) classifiers to create a hybrid classifier that screens items efficiently in terms of the cost of querying the crowd, while ensuring an accuracy that is acceptable for the given problem. We focus specifically on the common scenario of finite pool problems, where the set of items to screen is limited and where it may therefore not be cost-effective to collect sufficient data to train accurate classifiers for each specific case. To make the paper easier to read and the problem concrete, we will often use the example of SLRs mentioned above, which is rather challenging in that each SLR is different and each filtering predicate (called an exclusion criterion in that context) could be unique to that SLR (e.g., "exclude papers that do not study adults 85+ years old").

The area of crowd-only and of hybrid (ML+H) classification has received a lot of attention in the literature. Research in crowdsourcing has identified h...
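To make the screening setting concrete, the following is a minimal illustrative sketch (not the paper's actual algorithm) of hybrid multi-predicate screening: an item is in scope only if no exclusion predicate applies, and for each predicate the machine classifier's answer is used when it is confident, falling back to a crowd majority vote otherwise. All function names, thresholds, and the decision rule are hypothetical assumptions for illustration.

```python
def decide_predicate(ml_prob, crowd_votes, hi=0.9, lo=0.1):
    """Return True if the exclusion predicate applies to the item.

    ml_prob: the machine classifier's estimated probability that the
    predicate applies. If it is confident (>= hi or <= lo), use it;
    otherwise fall back to a majority vote of crowd workers.
    Thresholds are illustrative, not taken from the paper.
    """
    if ml_prob >= hi:   # machine confident: predicate applies
        return True
    if ml_prob <= lo:   # machine confident: predicate does not apply
        return False
    # Machine uncertain: resolve by crowd majority vote.
    return sum(crowd_votes) > len(crowd_votes) / 2

def screen_item(predicate_evidence):
    """An item passes screening only if NO exclusion predicate applies.

    predicate_evidence: list of (ml_prob, crowd_votes) pairs, one per
    exclusion criterion.
    """
    return not any(decide_predicate(p, v) for p, v in predicate_evidence)

# Example: two criteria; the first is confidently inapplicable, the
# second is uncertain and resolved by three crowd votes (majority "no").
print(screen_item([(0.05, []), (0.5, [False, False, True])]))  # True
```

Because an item is excluded as soon as any predicate fires, a cost-aware scheduler can query the cheapest or most discriminative predicates first and skip the rest, which is where the cost savings of hybrid screening come from.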