BRAC University Institutional Repository

Analysis of N-Gram based text categorization for Bangla in a newspaper

dc.contributor.author Mansur, Munirul
dc.contributor.author UzZaman, Naushad
dc.contributor.author Khan, Mumit
dc.date.accessioned 2010-10-21T09:14:58Z
dc.date.available 2010-10-21T09:14:58Z
dc.date.copyright 2006
dc.date.issued 2006
dc.identifier.uri http://hdl.handle.net/10361/623
dc.description Includes bibliographical references (page 7).
dc.description.abstract In this paper, we study the outcome of using an n-gram based algorithm for Bangla text categorization. To analyze the efficiency of this methodology, we used a one-year Prothom-Alo news corpus. Our results show that n-grams of length 2 or 3 are the most useful for categorization. Using gram lengths greater than 3 reduces categorization performance. en_US
dc.description.statementofresponsibility Munirul Mansur
dc.description.statementofresponsibility Naushad UzZaman
dc.description.statementofresponsibility Mumit Khan
dc.format.extent 7 pages
dc.language.iso en en_US
dc.publisher BRAC University en_US
dc.title Analysis of N-Gram based text categorization for Bangla in a newspaper en_US
dc.type Article en_US
dc.contributor.department Center for Research on Bangla Language Processing (CRBLP), BRAC University
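
The abstract refers to n-gram based categorization without describing the procedure, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes character n-grams, a frequency-ranked profile per category, and an "out-of-place" rank distance in the style of Cavnar and Trenkle. The function names, the `top_k` cutoff, and the toy English samples are hypothetical and chosen for illustration. Python strings are sequences of Unicode code points, so the same code would accept Bangla text, although a Bangla conjunct may span several code points.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def profile(text, n, top_k=300):
    """Map each of the top_k most frequent n-grams to its rank (0 = most frequent)."""
    counts = Counter(char_ngrams(text, n))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {gram: rank for rank, (gram, _) in enumerate(ranked)}


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category get a fixed penalty."""
    max_penalty = len(cat_profile)
    return sum(
        abs(rank - cat_profile[gram]) if gram in cat_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def categorize(text, category_texts, n=3):
    """Return the category whose profile is closest to the document's profile."""
    doc = profile(text, n)
    cats = {name: profile(sample, n) for name, sample in category_texts.items()}
    return min(cats, key=lambda name: out_of_place(doc, cats[name]))


# Toy training samples (hypothetical; a real run would use a labeled news corpus).
samples = {
    "sports": "cricket match cricket team wicket",
    "business": "market stock bank market price",
}
print(categorize("cricket wicket", samples, n=3))
```

Varying `n` in `categorize` is how one would compare gram lengths such as 2, 3, and 4 on a labeled corpus.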


Files in this item

Files Size Format
Analysis of N-gram based.pdf 229.1Kb PDF