Social Media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted in social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted to a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on the society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech in Twitter. The contributions of this research are many-fold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in Social Media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach consists of a combination of a LTSM+MLP neural network that takes as input the tweet’s word, emoji, and expression tokens’ embeddings enriched by the tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature.
In the current economic climate, law enforcement agencies are facing resource shortages. The effective and efficient use of scarce resources is therefore of the utmost importance to provide a high standard public safety service. Optimization models specifically tailored to the necessity of police agencies can help to ameliorate their use. The Multicriteria Police Districting Problem (MC-PDP) on a graph concerns the definition of sound patrolling sectors in a police district. The objective of this problem is to partition a graph into convex and continuous subsets, while ensuring efficiency and workload balance among the subsets. The model was originally formulated in collaboration with the Spanish National Police Corps. We propose for its solution three local search algorithms: a Simple Hill Climbing, a Steepest Descent Hill Climbing, and a Tabu Search. To improve their diversification capabilities, all the algorithms implement a multistart procedure, initialized by randomized greedy solutions. The algorithms are empirically tested on a case study on the Central District of Madrid. Our experiments show that the solutions identified by the novel Tabu Search outperform the other algorithms. Finally, research guidelines for future developments on the MC-PDP are given.
Miguel 2018. Applying automatic text-based detection of deceptive language to police reports: Extracting behavioral patterns from a multi-step classification model to understand how we lie to the police. Knowledge-Based Systems 149 , pp.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.