
dc.contributor.advisor: Chakrabarty, Amitabha
dc.contributor.author: Razzak, Razia
dc.contributor.author: Sadril, Md.
dc.contributor.author: Shakil, Mahmudul Hasan
dc.contributor.author: Rahman, Mahfuzur
dc.contributor.author: Taki, Sabiha Tul Omman
dc.date.accessioned: 2021-07-15T06:18:46Z
dc.date.available: 2021-07-15T06:18:46Z
dc.date.copyright: 2021
dc.date.issued: 2021-01
dc.identifier.other: ID: 16101291
dc.identifier.other: ID: 16301032
dc.identifier.other: ID: 16301026
dc.identifier.other: ID: 16101206
dc.identifier.other: ID: 17101519
dc.identifier.uri: http://hdl.handle.net/10361/14810
dc.description: This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021. [en_US]
dc.description: Cataloged from PDF version of thesis.
dc.description: Includes bibliographical references (pages 54-56).
dc.description.abstract: Recent years have seen rapid growth in information technology and a disruptive transformation of social media. Websites such as Facebook, Twitter, and Instagram, where people can express their thoughts or feelings by posting text, photos, or videos, have become incredibly popular. Unfortunately, they have also become a place for hateful activity, abusive language, cyberbullying, and anonymous threats. Many existing works address this problem, but they have not yet achieved a satisfactory level of accuracy. In this work, we employ natural language processing (NLP) with a convolutional neural network (CNN), extreme gradient boosting (XGBoost), and a support vector machine (SVM) to first segment out toxic comments and then classify them into six types, using a large collection of Wikipedia talk page edits provided by Kaggle. On this dataset, the Hamming score is 89% for the CNN model, 87% for the XGBoost model, and 84% for the SVM model. [en_US]
dc.description.statementofresponsibility: Razia Razzak
dc.description.statementofresponsibility: Md. Sadril
dc.description.statementofresponsibility: Mahmudul Hasan Shakil
dc.description.statementofresponsibility: Mahfuzur Rahman
dc.description.statementofresponsibility: Sabiha Tul Omman Taki
dc.format.extent: 56 Pages
dc.language.iso: en_US [en_US]
dc.publisher: Brac University [en_US]
dc.rights: Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject: Cyberbullying [en_US]
dc.subject: Natural Language Processing [en_US]
dc.subject: Word Embedding [en_US]
dc.subject: Convolutional Neural Networks [en_US]
dc.subject: XGBoost [en_US]
dc.subject: Support Vector Machine [en_US]
dc.title: Comparative study of toxic comments classification using machine learning algorithms [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Department of Computer Science and Engineering, Brac University
dc.description.degree: B. Computer Science

