State-of-the-art studies on cyberbullying detection using text classification predominantly take it for granted that streaming text can be completely labelled. However, the rapid growth of unlabelled data generated in real time from online content renders this virtually impossible. In this paper, we propose a session-based framework for automatic detection of cyberbullying within large volumes of unlabelled streaming text. Because streaming data from social networks arrives at the server system in large volumes, we incorporate an ensemble of one-class classifiers into the session-based framework. The system uses a multi-agent distributed environment to process streaming data from multiple social network sources. The proposed strategy tackles real-world situations in which only a few positive instances of cyberbullying are available for initial training. Our main contribution is the automatic detection of cyberbullying in real-world situations where labelled data is not readily available. Initial results indicate that the suggested approach is reasonably effective for detecting cyberbullying automatically on social networks. The experiments indicate that the ensemble learner outperforms the single-window and fixed-window approaches, even though the learning process is based on positive and unlabelled data only; no negative data is available for training.