Bengali text summarization using TextRank, Fuzzy C-means and aggregated scoring techniques

Rahman, Alvee; Rafiq, Fahim Md; Saha, Ramkrishna; Rafian, Ruhit

Date

2018-12

Publisher

BRAC University

Abstract

Summarizing large documents, reports, news and research articles by hand is difficult and time-consuming. Text summarization techniques pick out the important points and sentences, reducing the time and effort required to read a whole article. Numerous summarization techniques have been applied to English, but work on Bengali text summarization is still comparatively limited. Furthermore, in our country, Bangladesh, summarization is mainly done by humans. Keeping that in mind, we aim to find a simple way of summarizing Bengali texts with the technology at hand. Text summarization can be of two types: abstractive or extractive. In this paper we use extractive text summarization to summarize Bengali passages, using Fuzzy C-Means, TextRank and Aggregate Sentence Scoring methodologies. We also carry out a comparative study of the three methodologies, aiming to find the most precise one for Bengali text summarization.

Keywords

TextRank; C-means; Text summarization

Description

This thesis is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2018.

Includes bibliographical references (pages 37-39).

Cataloged from PDF version of thesis.

Department

Department of Computer Science and Engineering, BRAC University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering)