Integrating Bangla script recognition support in tesseract OCR
(BRAC University, 2009)
Tesseract is considered one of the most accurate free software OCR engines currently available. It was originally developed by Hewlett-Packard from 1985 until 1995, and is currently maintained by Google. At present, Tesseract ...
Example based English-Bengali machine translation using wordnet
(BRAC University, 2009)
In this paper we propose an architecture of English-Bengali Example Based Machine Translation (EBMT) using WordNet. The proposed EBMT system has five steps: 1) Tagging 2) Parsing 3) Prepare the chunks of the sentence using ...
Analysis of N-Gram based text categorization for Bangla in a newspaper
(BRAC University, 2006)
In this paper, we study the outcome of using an n-gram based algorithm for Bangla text categorization. To analyze the efficiency of this methodology we used a one-year Prothom-Alo news corpus. Our results show that n-grams of ...
A high performance domain specific OCR for Bangla script
(BRAC University, 2007)
Research on recognizing Bengali script began in the mid-1980s. A variety of techniques have been applied and their performance examined. In this paper we present a high performance domain specific OCR ...
Development of annotated Bangla speech corpora
(BRAC University, 2010)
This paper describes the development procedure of three different Bangla read speech corpora which can be used for phonetic research and developing speech applications. Several criteria were maintained in the corpora ...
Building a foundation of HPSG-based treebank on Bangla language
(BRAC University, 2007)
Nowadays, the importance of a large annotated corpus for NLP researchers is widely known. In this paper, we describe an initial phase of developing a linguistically annotated corpus for the non-configurational language Bangla. ...
A light weight stemmer for Bengali and its use in spelling checker
(BRAC University, 2007)
Stemming is an operation that splits a word into the constituent root part and affix without doing complete morphological analysis. It is used to improve the performance of spelling checkers and information retrieval ...
Rule based segmentation of lower modifiers in complex Bangla scripts
(BRAC University, 2009)
Segmentation is the most challenging part of Bangla optical character recognition (OCR). To solve the problems of joining errors, several algorithms have been proposed in the literature, with varying degrees of accuracy. ...
Feature unification for morphological parsing in Bangla
(BRAC University, 2004)
This paper describes a Feature Unification Based Word Grammar model for the morphological parsing of Bangla words. While a normal morphological parsing strategy is adequate to decompose a word into morphemes, it is not able ...
Comparison of different POS tagging techniques (N-Gram, HMM and Brill's tagger) for Bangla
(BRAC University, 2006)
There are different approaches to the problem of assigning a part-of-speech tag to each word of a text, which is known as Part-Of-Speech (POS) tagging. In this paper we compare the performance of a few POS tagging ...