
    Identification of fake news using machine learning in distributed system

    View/Open
    19101665, 19201139, 19301277, 15301101, 17101176_CSE.pdf (1.094Mb)
    Date
    2021-06
    Publisher
    Brac University
    Author
    Saif, Mehruz
    Kanon, MD. Kamal Haque
    Hasan, Nazmul
    Hossen, MD. Shamim
    Anannya, Fatema Zohra
    URI
    http://hdl.handle.net/10361/15199
    Abstract
    The launch of the World Wide Web and the rapid adoption of social media platforms (such as Facebook and Twitter) paved the way for levels of information diffusion unparalleled in human history. Consumers are creating and sharing more information on social media platforms than ever before, and some of it is erroneous, deceptive, or has no basis in reality. The Internet and social media have made access to news considerably simpler and more convenient. Online users can easily follow events of interest, and the widespread use of mobile devices makes this easier still. However, with great potential comes enormous responsibility. A number of websites are dedicated almost entirely to the dissemination of fake news. Because social media platforms and online newspapers are so numerous on the web, it is easy to spread rumors and create chaos, so identifying fake news, a serious problem involving large-scale datasets, is vital in this era. The size of data sets is also increasing day by day, and data is growing faster than processing speeds. As a result, algorithms that need a huge quantity of data and processing are frequently run on a distributed computing system, which spreads the work across multiple nodes on several machines and is characterized by concurrency of components and the lack of a global clock. To our knowledge, a distributed system has not previously been used to detect fake news. In this paper, we ran four PySpark algorithms based on SparkContext, which provides massive storage for big data processing and analysis on several devices at the same time and has been found to be 100 times faster in memory and 10 times faster on disk. This allows us to monitor and control news and data in real time before it goes viral in the media.
    Keywords
    PySpark ML; RDD (Resilient Distributed Dataset); Random Forest; Factorization Machine Classifier; Linear SVC; Logistic Regression
     
    LC Subject Headings
    Machine Learning
     
    Description
    This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.
     
    Cataloged from PDF version of thesis.
     
    Includes bibliographical references (pages 41-42).
    Department
    Department of Computer Science and Engineering, Brac University
    Collections
    • Thesis & Report, BSc (Computer Science and Engineering)

    Copyright © 2008-2019 Ayesha Abed Library, Brac University 