Automatic text summarization using Gensim Word2Vec and K-Means Clustering Algorithm

Haider, Mofiz Mojib; Hossin, Md. Arman; Mahi, Hasibur Rashid

dc.contributor.advisor	Arif, Hossain
dc.contributor.author	Haider, Mofiz Mojib
dc.contributor.author	Hossin, Md. Arman
dc.contributor.author	Mahi, Hasibur Rashid
dc.date.accessioned	2021-05-29T15:48:36Z
dc.date.available	2021-05-29T15:48:36Z
dc.date.copyright	2020
dc.date.issued	2020-04
dc.identifier.other	ID: 16301038
dc.identifier.other	ID: 17301214
dc.identifier.other	ID: 16301035
dc.identifier.uri	http://dspace.bracu.ac.bd/xmlui/handle/10361/14446
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2020.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 28-29).
dc.description.abstract	The significance of text summarization in the Natural Language Processing (NLP) community has grown with the staggering increase in online textual material. Text summarization is the process of producing, from one or more texts, a shorter text that conveys the important information of the original. Text summarization techniques help pick out the indispensable points of the original texts, reducing the time and effort required to read the whole document. The problem has been approached from different points of view, in different domains, and using different concepts. Extractive and abstractive are the two main methods of summarizing text. Extractive summarization is primarily concerned with which content from the original document should appear in the summary, based on the frequency of words, phrases, and sentences. This research proposes a sentence-based clustering algorithm (K-Means) for summarizing a single document. For feature extraction, we have used Gensim Word2Vec; Gensim is intended to automatically extract semantic topics from documents as efficiently as possible.	en_US
dc.description.statementofresponsibility	Mofiz Mojib Haider
dc.description.statementofresponsibility	Md. Arman Hossin
dc.description.statementofresponsibility	Hasibur Rashid Mahi
dc.format.extent	29 pages
dc.language.iso	en_US	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Text summarization	en_US
dc.subject	Extractive	en_US
dc.subject	Single Document	en_US
dc.subject	NLP	en_US
dc.subject	Gensim	en_US
dc.subject	Word2Vec	en_US
dc.subject	K-Means	en_US
dc.title	Automatic text summarization using Gensim Word2Vec and K-Means Clustering Algorithm	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B. Computer Science
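
The approach named in the abstract (sentence vectors clustered with K-Means, then one representative sentence kept per cluster) can be illustrated with a minimal sketch. This is not the thesis's implementation: the tiny `toy_vectors` table is a hypothetical stand-in for a trained Gensim Word2Vec model, the K-Means loop is a plain-Python simplification, and the function names (`sentence_vector`, `kmeans`, `summarize`) and the rule of picking the sentence nearest each centroid are assumptions made for illustration.

```python
import math
import random

# Hypothetical 2-d word vectors standing in for a trained Word2Vec model.
toy_vectors = {
    "cat": [1.0, 0.0], "dog": [0.9, 0.1], "pet": [0.8, 0.0],
    "market": [0.0, 1.0], "stock": [0.1, 0.9], "price": [0.0, 0.8],
}

def sentence_vector(sentence, word_vectors, dim):
    """Average the vectors of known words; zero vector if none are known."""
    tokens = [t.strip(".,;:!?").lower() for t in sentence.split()]
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means: random initial centroids, then assign/update rounds."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids), centroids

def summarize(sentences, word_vectors, dim, k):
    """Keep, per cluster, the sentence nearest the centroid, in document order."""
    vectors = [sentence_vector(s, word_vectors, dim) for s in sentences]
    labels, centroids = kmeans(vectors, k)
    chosen = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            chosen.append(min(members, key=lambda i: math.dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]

sentences = [
    "The cat is a pet.",
    "A dog is a pet.",
    "The stock market fell.",
    "The market price rose.",
]
summary = summarize(sentences, toy_vectors, dim=2, k=2)
print(summary)
```

With a real model, the lookup table would be replaced by the trained word vectors, and the number of clusters `k` would set the length of the summary.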

Files in this item

Name: 16301038, 17301214, 16301035_C ...
Size: 1.150Mb
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1480]