In the current information era, microblogging sites are commonly used to express people's feelings, including fear, panic, hate and abuse. Monitoring and controlling abuse on social media, especially during pandemics such as COVID-19, can help keep public sentiment and morale positive. Developing machine-learning methods for fear and hate detection requires labelled data. However, obtaining labelled data in suddenly changed circumstances such as a pandemic is expensive, and acquiring it in a short time is impractical. Related labelled hate data from other domains or previous incidents may be available. However, the predictive accuracy of hate detection models trained on such data decreases significantly if the data distribution of the target domain, where the prediction will be applied, is different. To address this problem, we propose a novel concept of unsupervised progressive domain adaptation based on a deep-learning language model generated through multiple text datasets. We showcase the efficacy of the proposed method in hate speech and fear detection on a collection of tweets from the COVID-19 period, for which labelled information is unavailable.
Abstract: The advancement of smartphones with various types of sensors has enabled us to harness diverse information through crowd-sensing mobile applications. However, traditional approaches suffer from drawbacks such as high battery consumption, a trade-off for obtaining high-accuracy data with a high sampling rate. To mitigate battery consumption, we propose a low-sampling-rate point of interest (POI) extraction framework built upon validation-based stay point detection (VSPD) and sensor-fusion-based environment classification (SFEC). We studied various clustering algorithms and showed that the density-based spatial clustering of applications with noise (DBSCAN) algorithm produces the most accurate results among existing methods. The SFEC model is used to classify whether each POI clustered earlier by VSPD is in an indoor or outdoor environment. Real-world data are collected and benchmarked against existing clustering methods to demonstrate the effectiveness of the low-sampling-rate model on high-noise spatio-temporal data.
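To make the clustering step concrete, the following is a minimal sketch of plain DBSCAN applied to a handful of GPS fixes. It is not the authors' VSPD pipeline: the `haversine_m` and `dbscan` helpers, the 50 m radius, the minimum of 3 points, and the sample coordinates are all illustrative assumptions.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force region query (includes the point itself).
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # core point: keep growing the cluster
    return labels


# Hypothetical GPS fixes: two dwell locations and one isolated fix.
points = [
    (40.0000, -74.0000), (40.0001, -74.0001), (40.0002, -74.0000), (40.0001, -73.9999),
    (40.0100, -74.0100), (40.0101, -74.0101), (40.0100, -74.0102),
    (40.0500, -74.0500),
]
labels = dbscan(points, eps_m=50.0, min_pts=3)
print(labels)
```

Each non-noise label would correspond to a candidate POI. At low sampling rates fewer fixes fall inside each radius, so `eps_m` and `min_pts` would likely need tuning against real traces.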