    • Minimally segmenting performance Bangla optical character recognition using Kohonen network 

      Shatil, Adnan Mohammad Shoeb; Khan, Mumit (BRAC University, 2006)
      This paper presents a method to use a Kohonen neural network based classifier in a Bangla Optical Character Recognition (OCR) system, providing much higher performance than the traditional neural network based ones. It describes ...
    • Morphological analysis of inflecting compound words in Bangla 

      Dasgupta, Sajib; Khan, Naira; Sarkar, Asif Iqbal; Pavel, Dewan Shahriar Hossain; Khan, Mumit (BRAC University, 2005)
      The addition of inflectional suffixes in Bangla compound words is fairly complex. A compound is a word that is formed by two or more different words acting as a single entity. One of the key distinguishing features of ...
    • Morphological parsing of Bangla words using PC-KIMMO 

      Dasgupta, Sajib; Khan, Mumit (BRAC University, 2004)
      This paper describes morphological parsing of Bangla words using PC-KIMMO, based on Kimmo Koskenniemi's model of two-level morphology. There are three sections in the PC-KIMMO: rules section, lexicon section, and grammar ...
    • N-gram based statistical grammar checker for Bangla and English 

      Alam, Md. Jahangir; UzZaman, Naushad; Khan, Mumit (Center for Research on Bangla Language Processing (CRBLP), BRAC University, 2006)
      This paper describes a statistical grammar checker, which considers the n-gram based analysis of words and POS tags to decide whether the sentence is grammatically correct or not. We employed this technique for both Bangla ...
    • Optical character recognition for Bangla documents using HMM 

      Monjel, Md. Sheemam; Khan, Mumit (BRAC University, 2007)
      In this paper we describe an OCR program made for Bangla documents. This program uses HMM for the recognition process. The description of the full OCR program is too large to present here. So, we have given emphasis on ...
    • A proposed automated extraction procedure of Bangla text for corpus creation in unicode 

      Pavel, Dewan Shahriar Hossain; Sarkar, Asif Iqbal; Khan, Mumit (BRAC University, 2006)
      This paper addresses the issue of automated Bangla corpus creation, which will significantly help the processes of lexicon development, morphological analysis, automatic parts of speech detection and automatic grammar ...
    • Research report on Bangla optical character recognition using Kohonen network 

      Shatil, Adnan Md. Shoeb (BRAC University, 2007)
      This report discusses the theory and implementation of an Optical Character Recognition (OCR) for Bangla. The principal idea is to convert images of text documents such as those obtained from scanning a document into ...
    • Research report on Bangla wordnet development challenges and solutions 

      Khan, Mumit (BRAC University, 2007-10-08)
      We describe the initial design of Bangla WordNet (BWN), based on the English WordNet 2.2 distribution from Princeton University. Our goal is to develop a 5,000 entry Bangla WordNet over the next two years. At present, we ...
    • Research report on Bengali NLP engine for TTS 

      Alam, Firoj (BRAC University, 2008-04-07)
      This report describes the Bengali NLP processor for TTS, along with the challenges faced in developing the NLP processor.
    • Research report on Bengla lexicon 

      Hayder, Kamrul (BRAC University, 2007)
      We report on the compilation of a comprehensive Bangla word list lexicon. The current list contains 80,969 words from the Standard Chalita Bhasha (SCB) vocabulary. The word list is currently being used by the BRAC University ...
    • Research report on Bengla OCR training and testing methods 

      Hasnat, Md. Abul (BRAC University, 2007)
      In this paper we present the training and recognition mechanism of a Hidden Markov Model (HMM) based multi-font Optical Character Recognition (OCR) system for Bengali characters. In our approach, the central idea is to ...
    • Research report on Bengla tagged lexicon 

      Hayder, Kamrul; Islam, Md Zahurul; Khan, Mumit (BRAC University, 2007)
      This report describes the design and implementation of a Bangla tagged lexicon. The resulting lexicon contains 144,770 entries, out of which 58,145 are verbs. The tags used in the lexicon are reproduced here from a previous ...
    • Research report on Bengla tagset 

      Mahmud, Altaf; Khan, Mumit (BRAC University, 2007)
      This report describes the design of a POS tagset for Bangla, based on the Penn Treebank design. The resulting tagset contains 53 morpho-syntactic tags.
    • Research report on Bengla Verb and Noun Morphological analysis 

      Islam, Md. Zahurul (BRAC University, 2007)
      This report describes inflectional Bangla verb and noun morphology, along with the rules, lexicons, and grammar for Bangla morphological analysis.
    • Research report on parallel corpus translation challenges and processes 

      Khan, Mumit (BRAC University, 2007-10-08)
      We describe some of the challenges in developing English-Bangla parallel corpora, and look at some of the established processes used by other language corpora for solutions to some of these challenges.
    • Research report on Translations of gTLDs and ccTLDs in Bangla 

      Alam, Firoj; Habib, Murtoza; Hayder, Kamrul; Khan, Mumit (BRAC University, 2007-10-08)
      This report describes the initial translations of gTLDs and ccTLDs in Bengali, along with the challenges faced in creating the translations.
    • Rule based automated pronunciation generator 

      Mosaddeque, Ayesha Binte; UzZaman, Naushad; Khan, Mumit (BRAC University, 2006)
      This paper presents a rule based pronunciation generator for Bangla words. It takes a word and finds the pronunciations for the graphemes of the word. A grapheme is a unit in writing that cannot be analyzed into smaller ...
    • Rule based segmentation of lower modifiers in complex Bangla scripts 

      Hasnat, Md. Abul; Khan, Mumit (BRAC University, 2009)
      Segmentation is the most challenging part of Bangla optical character recognition (OCR). To solve the problems of joining errors, several algorithms have been proposed in the literature, with varying degrees of accuracy. ...
    • Segmentation free Bangla OCR using HMM: Training and recognition 

      Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit (BRAC University, 2007)
      HMM is widely applied in speech recognition, where each spoken word is considered as a single unit to be recognized from the trained word network. Using this concept some research has been done for ...
    • Skew angle detection of Bangla script using Radon transform 

      Habib, S. M. Murtoza; Noor, Nawsher Ahamed; Khan, Mumit (BRAC University, 2006)
      Skew angle detection and correction are an integral part of any OCR system. Without proper skew correction, the performance of an OCR will simply not be acceptable for most scanned images. We propose an innovative method for ...