Identifying Bangla deceptive news using machine learning and deep learning algorithms
Abstract
Internet-based resources are used by the vast majority of people today. News
published on websites and shared on social media platforms is one example of
such resources. Due to the increasing number of content creators, online media
portals, and news portals, it has become nearly impossible to verify the veracity
of news headlines and undertake thorough assessments of them. The overwhelming
majority of fraudulent headlines contain misleading or false information, and they
attract more views and shares from people of all ages through clickbait titles
built on fictitious terms. These false and misleading
headlines disrupt the lives of ordinary people and mislead them in
numerous ways. We have used recent Bangla news articles to create a model that
can accurately determine the reliability of the news. To detect fake Bangla
news stories, we have used approximately 10,000 news articles to train our machine
learning and deep learning models. In addition, we use BNLP and
BLTK for a range of Bangla natural language processing tasks and bn_w2v_wiki,
a word embedding model for the Bangla language, to represent words as vectors. The
Synthetic Minority Over-sampling Technique (SMOTE) was used to correct the class
imbalance in our dataset. On the training data, we have employed both machine
learning and deep learning algorithms. Our deep learning model, LSTM,
performs best, with an accuracy of 91%. Our machine learning models, Random
Forest and Support Vector Machine, also perform well enough to compete with LSTM
in predicting fake news. The other machine learning algorithms included are
LR, KNN, GNB, bagging, and boosting. Furthermore, we have developed a website that
takes Bangla news text as input and classifies the news with the help of our trained
model. We believe our study will help establish a foundation
for research on the low-resource Bangla language and open new doors for future
study.
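
As a rough illustration of the oversampling step mentioned above, the sketch below shows the core idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the implementation used in this study. The function name `smote_sketch`, the value `k=3`, and the toy two-dimensional vectors (standing in for word-embedding features) are all assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: a hypothetical helper, not a library API.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Find the k nearest minority neighbours of the chosen sample.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors (assumed data): six majority and three minority samples.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate enough synthetic samples to match the majority class size.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into evaluation data.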