BRAC University Institutional Repository

Syntactic part of speech tagging guidelines for Bangla text

Mahmud, Altaf; Khan, Mumit 2010-10-27T04:29:02Z 2010-10-27T04:29:02Z 2009 2009
dc.description Includes bibliographical references (page 68).
dc.description.abstract Recently, several techniques have been tested for automatically assigning parts of speech to Bangla text using different tagsets. However, there remains a need for a formally published standard tagset for Bangla that supports syntactic bracketing, together with a detailed POS tagging guideline that shows annotators how a word should be tagged in a particular context. This document presents a guideline for annotating Bangla text with part-of-speech tags to assist the syntactic bracketing task. The tagset consists of 55 tags in total, designed to cover all of the required syntactic categories precisely and to encode the syntactic information needed for advanced linguistic analysis of a morphologically rich language with flexible word order. After training a simple Brill tagger on a manually tagged corpus of around 25,000 words, an overall accuracy of 70.6% was achieved, which is comparable to the minimal standard set by various experimental results using simple supervised learning methods on Bangla text. en_US
dc.description.statementofresponsibility Altaf Mahmud
dc.description.statementofresponsibility Mumit Khan
dc.format.extent 73 pages
dc.language.iso en en_US
dc.publisher BRAC University en_US
dc.subject Bangla language processing
dc.title Syntactic part of speech tagging guidelines for Bangla text en_US
dc.type Technical report en_US
dc.contributor.department Center for Research on Bangla Language Processing (CRBLP), BRAC University
