dc.contributor.advisor | Mostakim, Moin | |
dc.contributor.author | Bushra, Tabassum Khan | |
dc.contributor.author | Saha, Kallol | |
dc.contributor.author | Mulki, Ammin Hossain | |
dc.contributor.author | Khan, Sanjana Sabah | |
dc.contributor.author | Binta Amzad, Afrin | |
dc.date.accessioned | 2023-04-03T08:00:38Z | |
dc.date.available | 2023-04-03T08:00:38Z | |
dc.date.copyright | 2022 | |
dc.date.issued | 2022-10 | |
dc.identifier.other | ID: 18101163 | |
dc.identifier.other | ID: 18101461 | |
dc.identifier.other | ID: 18101468 | |
dc.identifier.other | ID: 18101502 | |
dc.identifier.other | ID: 19301267 | |
dc.identifier.uri | http://hdl.handle.net/10361/18070 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2022. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 67-68). | |
dc.description.abstract | As one of the fastest-growing and most prominent deep learning technologies being explored
today, sentiment analysis is capable of revealing an individual’s true emotions
by analyzing their speech, text, facial expressions, gestures, and so on. The
technology is being constantly used to understand how different individuals feel or
react when they are put under certain circumstances or situations. The information
obtained from such analyses is then processed to unravel the subject’s sentimental
reactions to said circumstances and situations, which can further be utilized in a
multitude of ways. While the technology itself is constantly being improved upon,
opportunities still exist to make it more efficient. This research aims to use a variety of machine learning algorithms and language models for sentiment detection
in textual data, and to understand how each of these algorithms and models approaches
the problems presented to it through the textual data. This is to be achieved
utilizing five models that fall under three groups, namely primitive or simple models
featuring TF-IDF and Bag of Words; mid-complexity models featuring Naive Bayes;
and advanced context-identifying state-of-the-art models, namely LSTM and BERT.
The datasets for this research include the Spotify App Reviews Dataset and the 100K
Coursera’s Course Reviews Dataset. We used 10,000 samples from these datasets
for our research. After running the suggested models, the research aims to discover
which of them works best and on which datasets, whether or not there are any
similarity patterns between them, and whether or not any of the suggested models
provide poor or disappointing results, all of which are provided in descriptive and
quantified forms, as well as through graphical representation. For 5-label sentiment
classification, Multinomial Naive Bayes gave the highest accuracy score for
the Coursera’s Course Review dataset and LSTM scored highest for the Spotify App Review
dataset, at 74.81% and 62.7% respectively. For 3-label classification, pretrained BERT
gave the highest accuracy score for the Coursera dataset and LSTM gave the highest
score for the Spotify dataset, at 91.2% and 78.3% respectively. However, since
our datasets are highly imbalanced, the accuracy score is a poor metric for performance evaluation of the algorithms, so we looked at the F1 scores instead. We
have also addressed the imbalance in our datasets by using different bias-handling
techniques, such as random oversampling of the minority classes. After carefully observing
the F1 scores for all the class predictions of our algorithms in both cases of sentiment
label categorization, we finally reached the conclusion that both LSTM and BERT
performed the best for both datasets. | en_US |
dc.description.statementofresponsibility | Tabassum Khan Bushra | |
dc.description.statementofresponsibility | Kallol Saha | |
dc.description.statementofresponsibility | Ammin Hossain Mulki | |
dc.description.statementofresponsibility | Sanjana Sabah Khan | |
dc.description.statementofresponsibility | Afrin Binta Amzad | |
dc.format.extent | 68 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | BERT | en_US |
dc.subject | Bag of Words | en_US |
dc.subject | TF-IDF | en_US |
dc.subject | Naive Bayes | en_US |
dc.subject | LSTM | en_US |
dc.subject.lcsh | Machine learning | |
dc.title | Recognizing sentimental emotions in text by using Machine Learning | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B. Computer Science | |