    • Analysis of and observations from a Bangla News Corpus 

      Majumder, Khair Md. Yasir Arafat (BRAC University, 2006)
      In this paper we present the compilation methodology and some statistical analysis on a Bangla news corpus, “Prothom-Alo”, which is the first of its kind for Bangla. We compare some of the statistics with the CIIL Bangla ...
    • Analysis of N-Gram based text categorization for Bangla in a newspaper 

      Mansur, Munirul; UzZaman, Naushad; Khan, Mumit (BRAC University, 2006)
      In this paper, we study the outcome of using an n-gram based algorithm for Bangla text categorization. To analyze the efficiency of this methodology we used one year of the Prothom-Alo news corpus. Our results show that n-grams of ...
    • GIS Based Real Time Traveler Information System: An Efficient Approach to Minimize Travel Time Using Available Media 

      Hasnat, Md. Abul; Haque, Mohammad Mahmudul; Khan, Mumit (BRAC University, 2006)
      This paper addresses the issue of building a Bangla lexicon with a collaborative effort through stand alone application and web based interface. The words in the lexicon will be annotated with a combination of tags addressing ...
    • Infrastructure for Bangla information retrieval in context of ICT for development 

      Haque, Nafid; Ali, M Hammad; Abduallah, Matin Saad (BRAC University, 2006)
      In this paper, we talk about developing a search engine and information retrieval system for Bangla. Current work done in this area assumes the use of a particular type of encoding or the availability of particular facilities ...
    • History (Forward N-Gram) or future (Backward N-Gram)? Which model to consider for N-Gram analysis in Bangla? 

      Khan, Naira; Habib, Md. Tarek; Alam, Md. Jahangir; Rahman, Rajib; UzZaman, Naushad; Khan, Mumit (BRAC University, 2006)
      This paper presents a directional advantage of n-gram modeling in terms of backward or forward n-gram modeling in Bangla. The most commonly used n-gram analysis is predominantly a forward n-gram. However, in Bangla it appears ...
    • Bangla text input and rendering support for short message service on mobile devices 

      Rownok, Tofazzal; Islam, Md. Zahurul; Khan, Mumit (BRAC University, 2006)
      Technology is deeply involved in our everyday life. It touches almost every aspect of life, such as communication, work, shopping and recreation. Communication through mobile devices is the most ...
    • A proposed automated extraction procedure of Bangla text for corpus creation in unicode 

      Pavel, Dewan Shahriar Hossain; Sarkar, Asif Iqbal; Khan, Mumit (BRAC University, 2006)
      This paper addresses the issue of automated Bangla corpus creation, which will significantly help the processes of lexicon development, morphological analysis, automatic parts of speech detection and automatic grammar ...
    • Building a foundation of HPSG-based treebank on Bangla language 

      Mahmud, Altaf; Khan, Mumit (BRAC University, 2007)
      Nowadays, the importance of a large annotated corpus for NLP researchers is widely known. In this paper, we describe an initial phase of developing a linguistically annotated corpus for the non-configurational language Bangla. ...
    • Research report on Bangla optical character recognition using Kohonen network 

      Shatil, Adnan Md. Shoeb (BRAC University, 2007)
      This report discusses the theory and implementation of an Optical Character Recognition (OCR) system for Bangla. The principal idea is to convert images of text documents, such as those obtained from scanning a document, into ...
    • Text to speech for Bangla language using festival 

      Alam, Firoj; Nath, Promila Kanti; Khan, Mumit (BRAC University, 2007)
      In this paper, we present a Text to Speech (TTS) synthesis system for the Bangla language using the open-source Festival TTS engine. Festival is a complete TTS synthesis system, with components supporting front-end processing ...
    • Segmentation free Bangla OCR using HMM: Training and recognition 

      Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit (BRAC University, 2007)
      HMMs are most widely applied in speech recognition, where each spoken word is considered as a single unit to be recognized from the trained word network. Using this concept some research has been done for ...
    • Automatic Bangla corpus creation 

      Sarkar, Asif Iqbal; Pavel, Dewan Shahriar Hossain; Khan, Mumit (BRAC University, 2007)
      This paper addresses the issue of automatic Bangla corpus creation, which will significantly help the processes of Lexicon development, Morphological Analysis, Automatic Parts of Speech Detection and Automatic grammar ...
    • A light weight stemmer for Bengali and its use in spelling checker 

      Islam, Md. Zahurul; Uddin, Md. Nizam; Khan, Mumit (BRAC University, 2007)
      Stemming is an operation that splits a word into the constituent root part and affix without doing complete morphological analysis. It is used to improve the performance of spelling checkers and information retrieval ...
    • Comparison of Unigram, Bigram, HMM and Brill's POS tagging approaches for some South Asian languages 

      Hasan, Muhammad Fahim; UzZaman, Naushad; Khan, Mumit (BRAC University, 2007)
      Part-of-Speech (POS) Tagging is a process that assigns each word in a sentence a suitable tag from a given set of tags. POS Tagging is important in various areas of Natural Language Processing. Different methods of ...
    • Research report on Bangla tagged lexicon 

      Hayder, Kamrul; Islam, Md Zahurul; Khan, Mumit (BRAC University, 2007)
      This report describes the design and implementation of a Bangla tagged lexicon. The resulting lexicon contains 144,770 entries, out of which 58,145 are verbs. The tags used in the lexicon are reproduced here from a previous ...
    • Research report on Bangla tagset 

      Mahmud, Altaf; Khan, Mumit (BRAC University, 2007)
      This report describes the design of a POS tagset for Bangla, based on the Penn Treebank design. The resulting tagset contains 53 morpho-syntactic tags.
    • Localization bridging the digital divide 

      Haque, Nafid (BRAC University, 2007)
      In this paper, a proposal is made to provide equitable access to education and technology for the people of the underdeveloped and developing countries of the world. Here a concept called localization has been ...
    • Isolated and continuous Bangla speech recognition: implementation, performance and application perspective 

      Hasnat, Md. Abul; Mowla, Jabir; Khan, Mumit (BRAC University, 2007)
      Research on automatic speech recognition has progressed steadily since 1930, and the major advances have come since 1980 with the introduction of the statistical modeling of speech with the key technology Hidden Markov ...
    • Research report on Bangla Verb and Noun Morphological analysis 

      Islam, Md. Zahurul (BRAC University, 2007)
      This report describes inflectional Bangla verb and noun morphology, together with the rules, lexicons and grammar for Bangla morphological analysis.
    • A high performance domain specific OCR for Bangla script 

      Hasnat, Md. Abul; Habib, S. M. Murtoza; Khan, Mumit (BRAC University, 2007)
      Research on recognizing Bengali script started in the mid-1980s. A variety of techniques have been applied and their performance examined. In this paper we present a high performance domain specific OCR ...