Recently added

Comparison of different POS tagging techniques (N-Gram, HMM and Brill's tagger) for Bangla

Hasan, Fahim Muhammad; UzZaman, Naushad; Khan, Mumit (BRAC University, 2006)

There are different approaches to the problem of assigning a part-of-speech tag to each word of a text, a task known as Part-Of-Speech (POS) tagging. In this paper we compare the performance of a few POS tagging ...

Bangla text input and rendering support for short message service on mobile devices

Rownok, Tofazzal; Islam, Md. Zahurul; Khan, Mumit (BRAC University, 2006)

Technology is involved in almost every aspect of everyday life, such as communication, work, shopping, and recreation. Communication through mobile devices is the most ...

Analysis of N-Gram based text categorization for Bangla in a newspaper

Mansur, Munirul; UzZaman, Naushad; Khan, Mumit (BRAC University, 2006)

In this paper, we study the outcome of using an n-gram based algorithm for Bangla text categorization. To analyze the efficiency of this methodology we used one year of the Prothom-Alo news corpus. Our results show that n-grams of ...

Analysis of and observations from a Bangla News Corpus

Majumder, Khair Md. Yasir Arafat (BRAC University, 2006)

In this paper we present the compilation methodology and some statistical analysis of a Bangla news corpus, “Prothom-Alo”, which is the first of its kind for Bangla. We compare some of the statistics with the CIIL Bangla ...

Morphological analysis of inflecting compound words in Bangla

Dasgupta, Sajib; Khan, Naira; Sarkar, Asif Iqbal; Pavel, Dewan Shahriar Hossain; Khan, Mumit (BRAC University, 2005)

The addition of inflectional suffixes in Bangla compound words is fairly complex. A compound is a word that is formed by two or more different words acting as a single entity. One of the key distinguishing features of ...

Morphological parsing of Bangla words using PC-KIMMO

Dasgupta, Sajib; Khan, Mumit (BRAC University, 2004)

This paper describes morphological parsing of Bangla words using PC-KIMMO, based on Kimmo Koskenniemi's model of two-level morphology. There are three sections in PC-KIMMO: the rules section, the lexicon section, and grammar ...

Localization bridging the digital divide

Haque, Nafid (BRAC University, 2007)

In this paper, we propose a way to provide equitable access to education and technology for the people of the underdeveloped and developing countries of the world. Here a concept called localization has been ...

Isolated and continuous Bangla speech recognition: implementation, performance and application perspective

Hasnat, Md. Abul; Mowla, Jabir; Khan, Mumit (BRAC University, 2007)

Research on automatic speech recognition has progressed steadily since 1930, and the major advances have come since 1980 with the introduction of the statistical modeling of speech with the key technology Hidden Markov ...

Comparison of Unigram, Bigram, HMM and Brill's POS tagging approaches for some South Asian languages

Hasan, Muhammad Fahim; UzZaman, Naushad; Khan, Mumit (BRAC University, 2007)

Part-of-Speech (POS) tagging is a process that attaches a suitable tag from a given set of tags to each word in a sentence. POS tagging is important in various areas of Natural Language Processing. Different methods of ...

Rule based segmentation of lower modifiers in complex Bangla scripts

Hasnat, Md. Abul; Khan, Mumit (BRAC University, 2009)

Segmentation is the most challenging part of Bangla optical character recognition (OCR). To solve the problems of joining errors, several algorithms have been proposed in the literature, with varying degrees of accuracy. ...

Building a foundation of HPSG-based treebank on Bangla language

Mahmud, Altaf; Khan, Mumit (BRAC University, 2007)

Nowadays, the importance of a large annotated corpus for NLP researchers is widely known. In this paper, we describe an initial phase of developing a linguistically annotated corpus for the non-configurational language Bangla. ...

A light weight stemmer for Bengali and its use in spelling checker

Islam, Md. Zahurul; Uddin, Md. Nizam; Khan, Mumit (BRAC University, 2007)

Stemming is an operation that splits a word into the constituent root part and affix without doing complete morphological analysis. It is used to improve the performance of spelling checkers and information retrieval ...

A high performance domain specific OCR for Bangla script

Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit (BRAC University, 2007)

Research on recognizing Bengali script began in the mid-1980s. A variety of techniques have been applied and their performance examined. In this paper we present a high performance domain specific OCR ...

Skew angle detection of Bangla script using Radon transform

Habib, S. M. Murtoza; Noor, Nawsher Ahamed; Khan, Mumit (BRAC University, 2006)

Skew angle detection and correction are an integral part of any OCR system. Without proper skew correction, the performance of an OCR will simply not be acceptable for most scanned images. We propose an innovative method for ...

GIS Based Real Time Traveler Information System: An Efficient Approach to Minimize Travel Time Using Available Media

Hasnat, Md. Abul; Haque, Mohammad Mahmudul; Khan, Mumit (BRAC University, 2006)

This paper addresses the issue of building a Bangla lexicon with a collaborative effort through stand alone application and web based interface. The words in the lexicon will be annotated with a combination of tags addressing ...

Teaching compiler development to undergraduates using a template based approach

Islam, Md Zahurul; Khan, Mumit (BRAC University, 2005)

Compiler Design remains one of the most dreaded courses in any undergraduate Computer Science curriculum, due in part to the complexity and the breadth of the material covered in a typical 14-15 week semester time frame. ...

T12: an advanced text input system with phonetic support for mobile devices

UzZaman, Naushad; Khan, Mumit (BRAC University, 2005)

The popular T9 text input system for mobile devices uses a predictive dictionary-based disambiguation scheme, enabling a user to type in commonly-used words with low overhead. We present a new text input system called ...

A double metaphone encoding for Bangla and its application in spelling checker

UzZaman, Naushad; Khan, Mumit (BRAC University, 2005)

We present a Double Metaphone encoding for Bangla that can be used by spelling checkers to improve the quality of suggestions for misspelled words. The complex rules of Bangla spelling present a significant challenge in ...

A double metaphone encoding for approximate name searching and matching in Bangla

UzZaman, Naushad; Khan, Mumit (BRAC University, 2005)

Almost any word can be a Bangali name, and the name in turn is often spelled in many different ways, all of which are considered correct and interchangeable. The reason for the spelling complication is two-fold: (1) there ...

Feature unification for morphological parsing in Bangla

Dasgupta, Sajib; Khan, Dr. Mumit (BRAC University, 2004)

This paper describes a Feature Unification Based Word Grammar model for the morphological parsing of Bangla words. While a normal morphological parsing strategy is adequate to decompose a word into morphemes, it is not able ...

Collection: Conference Papers (Centre for Research on Bangla Language Processing)