BRAC University Institutional Repository

Syntactic part of speech tagging guidelines for Bangla text

dc.contributor.author Mahmud, Altaf
dc.contributor.author Khan, Mumit
dc.date.accessioned 2010-10-27T04:29:02Z
dc.date.available 2010-10-27T04:29:02Z
dc.date.issued 2009
dc.identifier.uri http://hdl.handle.net/10361/640
dc.description.abstract Recently, several techniques have been tested for automatically assigning parts of speech to Bangla text using different tagsets. However, there remains a need for a formally published standard tagset for Bangla that supports syntactic bracketing, together with a detailed POS tagging guideline that shows annotators how a word should be tagged in a particular context. This document presents a guideline for annotating Bangla text with parts of speech to assist the syntactic bracketing task. The tagset consists of 55 tags in total, designed to cover all of the required syntactic categories precisely and to encode the syntactic information needed for advanced linguistic analysis of a morphologically rich language with flexible word order. A simple Brill tagger trained on a manually tagged corpus of around 25,000 words achieved an overall accuracy of 70.6%, which is comparable to the minimal standard set by different experimental results using any simple supervised learning method on Bangla text. en_US
dc.language.iso en en_US
dc.publisher Center for Research on Bangla Language Processing (CRBLP), BRAC University en_US
dc.title Syntactic part of speech tagging guidelines for Bangla text en_US
dc.type Technical Report en_US


Files in this item

Files Size Format
altaf_POS_Guideline_2009.pdf 691.6Kb PDF
