Automatic Bangla corpus creation

Sarkar, Asif Iqbal; Pavel, Dewan Shahriar Hossain; Khan, Mumit (BRAC University, 2007)

This paper addresses the issue of automatic Bangla corpus creation, which will significantly help the processes of Lexicon development, Morphological Analysis, Automatic Parts of Speech Detection and Automatic grammar ...

Research report on Translations of gTLDs and ccTLDs in Bangla

Alam, Firoj; Habib, Murtoza; Hayder, Kamrul; Khan, Mumit (BRAC University, 2007-10-08)

This report describes the initial translations of gTLDs and ccTLDs in Bengali, along with the challenges faced in creating the translations.

Research report on parallel corpus translation challenges and processes

Khan, Mumit (BRAC University, 2007-10-08)

We describe some of the challenges in developing English-Bangla parallel corpora, and look at some of the established processes used by other language corpora for solutions to these challenges.

A light weight stemmer for Bengali and its use in spelling checker

Islam, Md. Zahurul; Uddin, Md. Nizam; Khan, Mumit (BRAC University, 2007)

Stemming is an operation that splits a word into the constituent root part and affix without doing complete morphological analysis. It is used to improve the performance of spelling checkers and information retrieval ...

Comparison of Unigram, Bigram, HMM and Brill's POS tagging approaches for some South Asian languages

Hasan, Muhammad Fahim; Naushad UzZaman; Khan, Mumit (BRAC University, 2007)

Part-of-Speech (POS) Tagging is a process that attaches each word in a sentence with a suitable tag from a given set of tags. POS Tagging is important in various areas of Natural Language Processing. Different methods of ...

A comprehensive Bangla spelling checker

Naushad UzZaman; Khan, Mumit (BRAC University, 2006)

We present a comprehensive Bangla spelling checker that improves the quality of suggestions for misspelled words. The complex rules for Bangla spelling present a significant challenge in producing suggestions for a ...

Text normalization system for Bangla

Alam, Firoj; Habib, S. M. Murtoza; Khan, Mumit (BRAC University, 2008)

This paper describes a text normalization system for the Bangla language (exonym: Bengali) that identifies the semiotic classes in a Bangla text corpus. After identifying the semiotic classes, a set of rules were ...

Research report on Bangla tagged lexicon

Hayder, Kamrul; Islam, Md Zahurul; Khan, Mumit (BRAC University, 2007)

This report describes the design and implementation of a Bangla tagged lexicon. The resulting lexicon contains 144,770 entries, out of which 58,145 are verbs. The tags used in the lexicon are reproduced here from a previous ...

BWN- A software platform for developing Bengali wordnet

Khan, Mumit; Faruqe, Farhana (BRAC University, 2008)

Advanced Natural Language Processing (NLP) applications are increasingly dependent on the availability of linguistic resources, ranging from digital lexica to rich tagged and annotated corpora. While these resources are ...

Research report on Bangla tagset

Mahmud, Altaf; Khan, Mumit (BRAC University, 2007)

This report describes the design of a POS tagset for Bangla, based on the Penn Treebank design. The resulting tagset contains 53 morpho-syntactic tags.
