Identifying code-mixed and code-switched hateful remarks on social media using NLP

Sinha, Sumaiya; Nawar, Naharin Siddiqui; Khan, Md. Abrar Faiaz

View/Open

20101141, 24141298, 19301106_CSE.pdf (601.0Kb)

Date

2024-05

Publisher

Brac University

Abstract

Online bullying has prevailed for years in the vast cesspool commonly known as online social media. The increasing use of social media and online communication has led to a rise in cyberbullying, which is often facilitated by the abundant use of code-mixing and code-switching. Research has been done on filtering out such derogatory remarks; however, little of it addresses code-switched and code-mixed hateful remarks. English has blended into our Bangla language so thoroughly that people regularly write Bangla in English letters for convenience, and English and Bangla are used interchangeably in everyday conversation as well. Our main objective in this research is to detect these code-switched and code-mixed hateful remarks, which we plan to do by taking advantage of state-of-the-art natural language processing technologies.

Keywords

Cyberbullying; Cyber harassment; Online bullying; Social media; Hate speech; NLP

LC Subject Headings

Natural language processing (Computer science); Automatic speech recognition; Deep learning (Machine learning)

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 43-45).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering)