E-Community Health and Toxicity

Two Hat, MITACS

Online communities abound today, arising on social networking sites, on the websites of real-world communities like schools or clubs, on web discussion forums, on the discussion boards of video games, and even on the comment pages of news sites and blogs. Some of these communities are “healthy” and foster polite discussion between respectful members, but others are “toxic” and devolve into virulent fights, trolling, cyber-bullying, fraud, or, even worse, incitement to suicide, radicalization, or the sexual predation and grooming of minors. Detecting toxic messages and toxic users is a major challenge, in part because toxic users are adversaries who actively try to circumvent or fool detection software and filters. Moreover, while much of the research literature has looked at online communities (for example in text normalization to correct misspelled words, in sentiment detection to infer the mood of users, or in user modeling to recognize the different personalities of users), most of it has assumed that users are cooperating with the software rather than deliberately trying to misdirect it.

The private company Two Hat has a software product, Community Sift, which assists community moderators in finding toxic messages in online conversations. In this research project, we will partner with Two Hat to achieve five general objectives, which will be detailed in later sections: (1) to explore improvements to the conversation handling tools and toxicity rating metrics within the context of the Community Sift system; (2) to research new methodologies for toxicity detection in online conversations; (3) to develop innovative algorithms to aggregate the various pieces of information in evidence files into a coherent evaluation and prediction; (4) to develop real-time implementations of these methodologies that can handle the massive data stream of online conversations; and (5) to study the nature of toxic behaviours, their impact on users and on online communities, and the mechanisms to curb them.