dc.contributor.author | Shatil, Adnan Md. Shoeb | |
dc.date.accessioned | 2010-10-28T04:08:49Z | |
dc.date.available | 2010-10-28T04:08:49Z | |
dc.date.copyright | 2007 | |
dc.date.issued | 2007 | |
dc.identifier.uri | http://hdl.handle.net/10361/658 | |
dc.description | Includes bibliographical references (page 13). | |
dc.description.abstract | This report discusses the theory and implementation of an Optical Character Recognition (OCR) system for Bangla. The principal idea is to convert images of text documents, such as those obtained by scanning a document, into editable text. This report does not address pre-processing steps such as skew correction and noise reduction (which are handled in a previous report), so the documents are assumed to be pre-processed by another tool in the pipeline. For training and recognition, the input is first converted to a binary image and then to a 25x25-pixel image; the only feature extracted from the image is a 625-bit vector, which is then trained or classified using a Kohonen neural network. The OCR shows excellent performance for documents with a single typeface. Work is in progress to extend it to handle multiple typefaces. | en_US |
dc.description.statementofresponsibility | Adnan Md. Shoeb Shatil | |
dc.format.extent | 13 pages | |
dc.language.iso | en | en_US |
dc.publisher | BRAC University | en_US |
dc.subject | Bangla language processing | |
dc.subject | Bangla OCR | |
dc.title | Research report on Bangla optical character recognition using Kohonen network | en_US |
dc.type | Technical report | en_US |
dc.contributor.department | Center for Research on Bangla Language Processing (CRBLP), BRAC University | |
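
The pipeline summarized in the abstract (binarize, scale to 25x25, flatten to a 625-bit vector, classify with a Kohonen network) can be illustrated with a minimal sketch. This is not the report's implementation: the function names, the threshold of 128, nearest-neighbour scaling, and the simplified winner-take-all update without a neighbourhood function are all assumptions made for illustration.

```python
SIZE = 25  # side of the normalized glyph image, as stated in the abstract


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1; dark pixels become 1.
    The threshold value is an assumption, not taken from the report."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def resize_nearest(img, size=SIZE):
    """Nearest-neighbour resample to size x size (assumed scaling method)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_vector(gray):
    """Binarize, scale to 25x25, and flatten row by row into 625 bits."""
    small = resize_nearest(binarize(gray))
    return [bit for row in small for bit in row]


def winner(weights, x):
    """Index of the unit whose weight vector is closest to x
    (squared Euclidean distance)."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train(samples, n_units, epochs=20, lr=0.5):
    """Simplified Kohonen-style training: only the winning unit moves
    toward each sample. Units are initialized from the samples in order."""
    weights = [[float(v) for v in samples[i % len(samples)]]
               for i in range(n_units)]
    for _ in range(epochs):
        for x in samples:
            k = winner(weights, x)
            weights[k] = [wi + lr * (xi - wi) for wi, xi in zip(weights[k], x)]
    return weights


# Two synthetic 50x50 "glyphs": left half dark, and top half dark.
left = [[0 if c < 25 else 255 for c in range(50)] for r in range(50)]
top = [[0 if r < 25 else 255 for c in range(50)] for r in range(50)]

vec_left, vec_top = to_vector(left), to_vector(top)
weights = train([vec_left, vec_top], n_units=2)
```

After training, `winner(weights, to_vector(image))` returns the index of the closest unit, which a caller would map to a character label. A full Kohonen network would also update the winner's neighbours with a decaying radius; that part is omitted here for brevity.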