BRAC University Institutional Repository

Integrating Bangla script recognition support in Tesseract OCR

dc.contributor.author Hasnat, Md. Abul
dc.contributor.author Chowdhury, Muttakinur Rahman
dc.contributor.author Khan, Mumit
dc.date.accessioned 2010-10-25T06:03:34Z
dc.date.available 2010-10-25T06:03:34Z
dc.date.copyright 2009
dc.date.issued 2009
dc.identifier.uri http://hdl.handle.net/10361/635
dc.description Includes bibliographical references (page 5).
dc.description.abstract Tesseract is considered one of the most accurate free software OCR engines currently available. It was originally developed by Hewlett-Packard from 1985 until 1995 and is currently maintained by Google. At present, Tesseract can recognize only English, French, Italian, German, Spanish and Dutch. However, Tesseract can be made to recognize other scripts if the engine is trained with the requisite data. In this paper, we present a complete methodology for integrating Bangla script recognition support into Tesseract. en_US
dc.description.statementofresponsibility Md. Abul Hasnat
dc.description.statementofresponsibility Muttakinur Rahman Chowdhury
dc.description.statementofresponsibility Mumit Khan
dc.format.extent 5 pages
dc.language.iso en en_US
dc.publisher BRAC University en_US
dc.subject Optical character reader (OCR)
dc.subject Bangla language processing
dc.title Integrating Bangla script recognition support in Tesseract OCR en_US
dc.type Article en_US
dc.contributor.department Center for Research on Bangla Language Processing (CRBLP), BRAC University


Files in this item

File: Integrating Bangla script.pdf
Size: 237.1 KB
Format: PDF
