Identifying code-mixed and code-switched hateful remarks on social media using NLP
Abstract
Online bullying has prevailed for years in the vast cesspool commonly known as
online social media. The increasing use of social media and online communication
has led to a rise in cyberbullying, which is often facilitated by the abundant
usage of code-mixing and code-switching. Prior research has addressed filtering
out derogatory remarks in general, but little of it has examined code-switched
and code-mixed hateful remarks. English has blended into our Bangla language
so thoroughly that people regularly write Bangla in English letters for
convenience. English and Bangla are used interchangeably in regular conversations
as well. Our main objective in this research is to detect these code-switched and
code-mixed hateful remarks, which we plan to do using state-of-the-art
natural language processing techniques.