Building a foundation of HPSG-based treebank on Bangla language

Mahmud, Altaf; Khan, Mumit

Date

2007

Publisher

BRAC University

Abstract

The importance of a large annotated corpus to NLP researchers is now widely recognized. In this paper, we describe the initial phase of developing a linguistically annotated corpus for Bangla, a non-configurational language. Because the formalism differs from those posited for configurational languages, several features have been added to support constraint-based parsing in an HPSG-based formalism. We propose an outline of a semi-automated process that applies both a case-marking approach and some morphological analysis to constrain the parsing of a relatively free word order language, with the aim of creating a linguistically rich, highly lexicalized annotated corpus.

Keywords

Treebank; HPSG; parsing; non-configuration; Treebanking

Description

Includes bibliographical references (page 6).

Department

Center for Research on Bangla Language Processing (CRBLP), BRAC University

Type

Article

Collections

Conference Papers (Centre for Research on Bangla Language Processing)