This paper demonstrates PERC-our system for crowdsourced entity resolution with human errors. Entity Resolution (ER) is a critical step in data cleaning and analytics. Although many machinebased methods existed for ER task, crowdsourcing is becoming increasingly important since humans can provide more insightful information for complex tasks, e.g., clustering of images and natural language processing. However, human workers still make mistakes due to lack of domain expertise or seriousness, ambiguity, or even malicious intent. To this end, we present a system, called PERC (probabilistic entity resolution with crowd errors), which adopts an uncertain graph model to address the entity resolution problem with noisy crowd answers. Using our framework, the problem of ER becomes equivalent to finding the maximum-likelihood clustering. In particular, we propose a novel metric called "reliability" to measure the quality of a clustering, which takes into account both the connected-ness inside and across all clusters. PERC then automatically selects the next question to ask the crowd that maximally increases the "reliability" of the current clustering. This demonstration highlights (1) a reliability-based next crowdsourcing framework for crowdsourced ER, which does not require any user-defined threshold, and no apriori information about the error rate of the crowd workers, (2) it improves the ER quality by 15% and reduces the crowdsourcing cost by 50% compared to state-ofthe-art methods, and (3) its GUI can interact with users to help them compare different crowdsourced ER algorithms, their intermediate ER results as they progress, and their selected next crowdsourcing questions in a user-friendly manner. Our demonstration video is at: https://www.youtube.com/watch?v=rQ7nu3b8zXY.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.