BRAC University Institutional Repository

Analysis of N-Gram based text categorization for Bangla in a newspaper

dc.contributor.author Mansur, Munirul
dc.contributor.author UzZaman, Naushad
dc.contributor.author Khan, Mumit
dc.date.accessioned 2010-10-21T09:14:58Z
dc.date.available 2010-10-21T09:14:58Z
dc.date.issued 2006
dc.identifier.uri http://hdl.handle.net/10361/623
dc.description.abstract In this paper, we study the outcome of using an n-gram-based algorithm for Bangla text categorization. To analyze the efficiency of this methodology, we used a one-year Prothom-Alo news corpus. Our results show that n-grams of length 2 or 3 are the most useful for categorization. Using n-gram lengths greater than 3 reduces categorization performance. en_US
dc.language.iso en en_US
dc.publisher Center for Research on Bangla Language Processing (CRBLP), BRAC University en_US
dc.title Analysis of N-Gram based text categorization for Bangla in a newspaper en_US
dc.type Article en_US
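
The abstract does not spell out the categorization algorithm, so the following is only a minimal sketch of one common n-gram approach (ranked character n-gram profiles compared with an "out-of-place" distance), not necessarily the authors' exact method. The function names, the `top_k` cutoff, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, n=3, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first.

    Whitespace is normalized and the text is padded with one space on each
    side so that word boundaries contribute n-grams. Python strings are
    Unicode, so Bangla text is handled per code point (an assumption: the
    original work may have tokenized differently).
    """
    padded = " " + " ".join(text.split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from cat_profile cost its length."""
    rank = {gram: i for i, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles, n=3, top_k=300):
    """Return the category whose profile is closest to the text's profile."""
    doc = ngram_profile(text, n, top_k)
    return min(
        category_profiles,
        key=lambda cat: out_of_place(doc, category_profiles[cat]),
    )
```

Under these assumptions, one profile would be built per news category from training articles with `ngram_profile`, and a new article would be assigned with `classify`. Varying `n` is how the effect of n-gram length reported in the abstract could be explored.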


Files in this item

File | Size | Format
Analysis of N-gram based.pdf | 229.1Kb | PDF
