dc.contributor.author | Mansur, Munirul | |
dc.contributor.author | UzZaman, Naushad | |
dc.contributor.author | Khan, Mumit | |
dc.date.accessioned | 2010-10-21T09:14:58Z | |
dc.date.available | 2010-10-21T09:14:58Z | |
dc.date.copyright | 2006 | |
dc.date.issued | 2006 | |
dc.identifier.uri | http://hdl.handle.net/10361/623 | |
dc.description | Includes bibliographical references (page 7). | |
dc.description.abstract | In this paper, we study the outcome of using an n-gram-based algorithm for Bangla text categorization. To analyze the efficiency of this methodology, we used a one-year Prothom-Alo news corpus. Our results show that n-grams of length 2 or 3 are the most useful for categorization. Using gram lengths greater than 3 reduces categorization performance. | en_US |
dc.description.statementofresponsibility | Munirul Mansur | |
dc.description.statementofresponsibility | Naushad UzZaman | |
dc.description.statementofresponsibility | Mumit Khan | |
dc.format.extent | 7 pages | |
dc.language.iso | en | en_US |
dc.publisher | BRAC University | en_US |
dc.title | Analysis of N-Gram based text categorization for Bangla in a newspaper | en_US |
dc.type | Article | en_US |
dc.contributor.department | Center for Research on Bangla Language Processing (CRBLP), BRAC University | |