A framework for sentiment analysis: a data-driven approach

Islam, Md. Jahedul; Sarker, Tonmoy; Shuvo, Md. Shubiour; Hossen, Md. Robin; Ahmedh, Minhaz Uddin

Date

2021-06

Publisher

Brac University

Abstract

The Internet offers free and straightforward access to an immense amount of raw text that can be mined for sentiment analysis. Such text has long been used for market research, user opinion mining, recommendation systems, analyzing people's views on a topic, and so on. Many different techniques have been developed, yet many complications remain. Selecting and understanding attribute patterns in a text dataset is important for building a good model and for knowing where that model can be used. Different text datasets have different relations between their attributes and classes. For example, take a dataset of entirely random English texts labelled as positive or negative. We expect the attributes extracted for the positive or negative class to be dominated by general words that are considered positive or negative in everyday English. However, if the dataset covers a niche topic, such as the economy or a pandemic, the positive and negative classes will probably be dominated by words specific to that topic, or the classifier may not consider those words important at all. In such cases, we might want to give importance specifically to those niche-specific attributes. In this paper, we take five different datasets with different instance lengths. We use Weka as a tool, apply several attribute selection techniques, perform sentence-level sentiment analysis, and finally extract patterns from the datasets to analyze them. There are few related works on these datasets, and our technique performed better than the existing works. We outperformed the Fuzzy method in terms of accuracy and extracted polarity in texts more effectively. Our approach has been shown to work better with these datasets than many earlier methods. We aim to present a method that can be readily applied to any dataset for text mining and achieve decent accuracy.

Keywords

Sentiment Analysis; Attribute selection; Pattern Extraction; Classification; Accuracy; Application of Machine Learning

LC Subject Headings

Classification; Machine learning

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 28-30).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1589]