Bengali character recognition using feature extraction
Author: Arif, Samiur Rahman
The character recognition problem can be framed as a classification task in which an image, or a portion of one, is assigned a label from a set of possible labels representing the characters under consideration. This formulation underlies feature extraction techniques, and it is generic enough to lead to quite different settings. When the images of the characters are obtained optically, we speak of “Optical Character Recognition” (OCR), as opposed to settings in which the input data is obtained by other means. OCR itself can be considered a subtask of the more general problem of “Document Analysis or Understanding”, where the goal is to obtain a symbolic representation of a digital image of a document that includes not only the recognized text (characters) but also the other document components and their relationships. In this thesis I discuss various feature extraction techniques and then examine how zoning can be used to build an efficient Bengali character recognition system. Different feature extraction techniques suit different representations of characters, for example binary characters, character contours, skeletons (thinned characters), or gray-level sub-images of each individual character. Feature extraction methods are distinguished by their invariance properties, reconstructability, and the expected distortions and variability of the characters. When choosing a feature extraction method, we need to consider how efficiently the resulting system runs and how much time it takes to build.
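To make the zoning idea concrete, the following is a minimal sketch and not the method used in the thesis. It assumes a binary character image stored as a list of rows of 0/1 values, a square grid of zones, and pixel density as the per-zone feature; the function name `zoning_features` and the parameter `zones_per_side` are hypothetical names chosen for illustration.

```python
def zoning_features(image, zones_per_side=2):
    """Split a binary image (list of rows of 0/1) into a square grid of
    zones and return the fraction of foreground pixels in each zone,
    ordered row by row. Grid size and density feature are assumptions."""
    height = len(image)
    width = len(image[0])
    features = []
    for zone_row in range(zones_per_side):
        # Integer boundaries so every pixel row falls in exactly one zone.
        r0 = zone_row * height // zones_per_side
        r1 = (zone_row + 1) * height // zones_per_side
        for zone_col in range(zones_per_side):
            c0 = zone_col * width // zones_per_side
            c1 = (zone_col + 1) * width // zones_per_side
            area = (r1 - r0) * (c1 - c0)
            ink = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            # Guard against empty zones when the grid is finer than the image.
            features.append(ink / area if area else 0.0)
    return features


# Toy 4x4 "glyph": a filled top-left block and a partial bottom-right block.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zoning_features(glyph, zones_per_side=2)
print(features)
```

The resulting vector has one density value per zone and could be passed to any classifier. A finer grid captures more local detail at the cost of a longer feature vector and greater sensitivity to small shifts of the character.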