BnText2Table – dataset and Text-to-Table generation in Bangla

Zariyat, Tahreema Rahman; Ahmed, Fahim Irfan; Oishi, Tahsina Tajrim; Morshed, Maruf

dc.contributor.advisor	Islam, Md Saiful
dc.contributor.author	Zariyat, Tahreema Rahman
dc.contributor.author	Ahmed, Fahim Irfan
dc.contributor.author	Oishi, Tahsina Tajrim
dc.contributor.author	Morshed, Maruf
dc.date.accessioned	2024-08-19T06:13:34Z
dc.date.available	2024-08-19T06:13:34Z
dc.date.copyright	2024
dc.date.issued	2024-01
dc.identifier.other	ID 20101433
dc.identifier.other	ID 20101508
dc.identifier.other	ID 20101394
dc.identifier.other	ID 20101299
dc.identifier.uri	http://hdl.handle.net/10361/23795
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 32-34).
dc.description.abstract	"In this fast-paced world, people rely on technology to get their work done quickly and efficiently. Most publications are lengthy and packed with crucial data, and in many instances extra words are added to boost the word count, which makes it difficult to find the desired information. For the English language, numerous tools are available to summarize text and present it in tabular form. The same is not true for our mother tongue, Bangla: although it is the 5th most-spoken native language in the world, no tool is available to ease this workload in the Bengali language. Our research addresses this gap by summarizing the given information in tabular form in the shortest possible time. Since no existing dataset was suitable for our research, we prepared the dataset ourselves. We then used the mBART-50-large, mT5-base, mT5-m2m-CrossSum and BanglaT5 models for the implementation. The most important task in this study is finding appropriate table headers in light of the context and order of the data. To sum up, our main goal is to develop a benchmark dataset for a text-to-table model for the benefit of the NLP research community."	en_US
dc.description.statementofresponsibility	Tahreema Rahman Zariyat
dc.description.statementofresponsibility	Fahim Irfan Ahmed
dc.description.statementofresponsibility	Tahsina Tajrim Oishi
dc.description.statementofresponsibility	Maruf Morshed
dc.format.extent	34 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Bangla NLP	en_US
dc.subject	Text2Table	en_US
dc.subject	Summarizer	en_US
dc.subject	mBART	en_US
dc.subject	Transformer	en_US
dc.subject	Information extraction	en_US
dc.subject	T5	en_US
dc.subject	mT5	en_US
dc.subject.lcsh	Computation and Language
dc.title	BnText2Table – dataset and Text-to-Table generation in Bangla	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B.Sc. in Computer Science

Files in this item

Name: 20101433_20101508_20101394_201 ...
Size: 2.439Mb
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1480]
