Toxic language is common in online forums, especially when politics and other polarizing topics arise, and can discourage people from joining or continuing conversations. In this paper, I use a dataset of comments annotated with the character indices of their toxic spans to train an RNN that identifies which parts of a comment make it toxic, a capability that could aid online moderators. I compare results on the original dataset against an augmented one, and GRU versus LSTM RNN models.
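The core idea of per-token toxicity scoring with a recurrent model can be sketched as follows. This is a minimal, from-scratch illustration, not the paper's actual architecture: the GRU equations are standard, but the dimensions, random weights, and the `score_tokens` helper are placeholders invented here for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One standard GRU time step: x is a token embedding, h the previous hidden state."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

def score_tokens(embeddings, hid_dim=16, seed=0):
    """Run a GRU over a sequence of token embeddings and emit a
    per-token toxicity probability from each hidden state.
    (Illustrative random weights; a real model would learn them.)"""
    emb_dim = embeddings.shape[1]
    rng = np.random.default_rng(seed)
    Wz, Wr, Wh = (rng.normal(0, 0.1, (hid_dim, emb_dim)) for _ in range(3))
    Uz, Ur, Uh = (rng.normal(0, 0.1, (hid_dim, hid_dim)) for _ in range(3))
    w_out = rng.normal(0, 0.1, hid_dim)        # linear readout to a toxicity logit
    h = np.zeros(hid_dim)
    probs = []
    for x in embeddings:                       # one step per token
        h = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
        probs.append(sigmoid(w_out @ h))       # probability this token is toxic
    return np.array(probs)
```

Swapping the GRU cell for an LSTM cell (which adds a separate memory cell and forget/input/output gates) yields the other model variant compared in the paper.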