Identification of fake news using machine learning in distributed system

Saif, Mehruz; Kanon, MD. Kamal Haque; Hasan, Nazmul; Hossen, MD. Shamim; Anannya, Fatema Zohra

dc.contributor.advisor	Akhond, Mostafijur Rahman
dc.contributor.author	Saif, Mehruz
dc.contributor.author	Kanon, MD. Kamal Haque
dc.contributor.author	Hasan, Nazmul
dc.contributor.author	Hossen, MD. Shamim
dc.contributor.author	Anannya, Fatema Zohra
dc.date.accessioned	2021-10-10T09:28:29Z
dc.date.available	2021-10-10T09:28:29Z
dc.date.copyright	2021
dc.date.issued	2021-06
dc.identifier.other	ID 19101665
dc.identifier.other	ID 19201139
dc.identifier.other	ID 19301277
dc.identifier.other	ID 15301101
dc.identifier.other	ID 17101176
dc.identifier.uri	http://hdl.handle.net/10361/15199
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 41-42).
dc.description.abstract	The launch of the World Wide Web and the rapid adoption of social media platforms (such as Facebook and Twitter) paved the way for levels of information diffusion unparalleled in human history. Consumers are creating and sharing more information on social media platforms than ever before, and some of it is erroneous, deceptive, or has no basis in reality. The Internet and social media have made access to news considerably simpler and more convenient. Online users can readily follow events of interest, and the widespread use of mobile devices makes this even easier. However, with great potential comes enormous responsibility. A number of websites are dedicated almost entirely to the dissemination of fake news. Because social media sites and online newspapers are so numerous on the web, it is easy to spread rumors and create chaos, which makes identifying fake news vital in this era; it is a serious problem that involves large-scale datasets. The size of datasets is also increasing day by day, and data is growing faster than processing speeds. As a result, algorithms that require large amounts of data and computation are frequently run on a distributed computing system, in which the work is spread across multiple nodes on several machines whose components operate concurrently and lack a global clock. To our knowledge, a distributed system has not been used to detect fake news before. In this thesis, we ran four PySpark algorithms on several devices at the same time, built on SparkContext, which provides massive storage for big data processing and analysis and has been found to be 100 times quicker in memory and 10 times quicker on disk. The aim is to enable control and real-time monitoring of news and data before it goes viral in the media.	en_US
dc.description.statementofresponsibility	Mehruz Saif
dc.description.statementofresponsibility	MD. Kamal Haque Kanon
dc.description.statementofresponsibility	Nazmul Hasan
dc.description.statementofresponsibility	MD. Shamim Hossen
dc.description.statementofresponsibility	Fatema Zohra Anannya
dc.format.extent	42 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	PySpark ML	en_US
dc.subject	RDD(Resilient Distributed Dataset)	en_US
dc.subject	Random Forest	en_US
dc.subject	Factorization Machine Classifier	en_US
dc.subject	Linear SVC	en_US
dc.subject	Logistic Regression	en_US
dc.subject.lcsh	Machine Learning
dc.title	Identification of fake news using machine learning in distributed system	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B. Computer Science

Files in this item

Name: 19101665, 19201139, 19301277, ...
Size: 1.094 MB
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1480]
