dc.contributor.author | Hasnat, Md. Abul | |
dc.contributor.author | Chowdhury, Muttakinur Rahman | |
dc.contributor.author | Khan, Mumit | |
dc.date.accessioned | 2010-10-25T06:03:34Z | |
dc.date.available | 2010-10-25T06:03:34Z | |
dc.date.copyright | 2009 | |
dc.date.issued | 2009 | |
dc.identifier.uri | http://hdl.handle.net/10361/635 | |
dc.description | Includes bibliographical references (page 5). | |
dc.description.abstract | Tesseract is considered one of the most accurate free software OCR engines currently available. It was originally developed by Hewlett-Packard between 1985 and 1995, and is currently maintained by Google. At present, Tesseract can recognize only English, French, Italian, German, Spanish and Dutch. However, Tesseract can be made to recognize other scripts if the engine is trained with the requisite data. In this paper, we present a complete methodology for integrating Bangla script recognition support into Tesseract. | en_US |
dc.description.statementofresponsibility | Md. Abul Hasnat | |
dc.description.statementofresponsibility | Muttakinur Rahman Chowdhury | |
dc.description.statementofresponsibility | Mumit Khan | |
dc.format.extent | 5 pages | |
dc.language.iso | en | en_US |
dc.publisher | BRAC University | en_US |
dc.subject | Optical character reader (OCR) | |
dc.subject | Bangla language processing | |
dc.title | Integrating Bangla script recognition support in Tesseract OCR | en_US |
dc.type | Article | en_US |
dc.contributor.department | Center for Research on Bangla Language Processing (CRBLP), BRAC University | |