    Rule based segmentation of lower modifiers in complex Bangla scripts

    Rule based segmentation of lower modifiers in complex Bangla scripts, 2009.pdf (221.0Kb)
    Date
    2009
    Publisher
    BRAC University
    Author
    Hasnat, Md. Abul
    Khan, Mumit
    URI
    http://hdl.handle.net/10361/338
    Abstract
    Segmentation is the most challenging part of Bangla optical character recognition (OCR). Several algorithms have been proposed in the literature to address joining errors, with varying degrees of accuracy. However, the selection of lower modifier container units and the subsequent extraction of the modifiers from the core unit during segmentation have not been studied extensively. We present a dissection-based lower modifier segmentation method that solves the problem of segmenting lower modifiers across a wide range of document images. A key goal of our methodology is to avoid over-segmenting units that do not actually contain a lower modifier, since such over-segmentation leads to unacceptably high error rates. The methodology consists of four tasks. First, we identify the lower modifier separator line using character height information and select the primary lower modifier containers. Second, we filter this set to eliminate the units or characters that do not actually contain a lower modifier. Third, we extract the lower modifier unit using features of the core units and the lower modifiers. Finally, we apply a set of empirical rules, aided by dictionary lookups, to eliminate most of the remaining errors, resulting in an accuracy of 99.6%.
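
The first task (locating a separator line from height information and picking candidate container units) can be illustrated with a minimal sketch. The abstract does not state the paper's actual rules, so everything here is an assumption made for illustration: the `Unit` bounding-box type, the use of the median bottom row as the separator line, and the `tolerance` threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Sequence


@dataclass(frozen=True)
class Unit:
    """Hypothetical bounding box of one segmented unit.

    Rows are image coordinates, so larger values are lower on the page.
    """
    left: int
    top: int
    bottom: int  # last pixel row occupied by the unit


def separator_line(units: Sequence[Unit]) -> float:
    """Assumed heuristic: take the median bottom row of the units in a text
    line as the separator; most units are expected to end on the baseline."""
    return median(u.bottom for u in units)


def primary_containers(units: Sequence[Unit], tolerance: float = 0.15) -> List[Unit]:
    """Return units that extend below the separator line by more than
    `tolerance` times the median unit height (an assumed threshold)."""
    sep = separator_line(units)
    typical_height = median(u.bottom - u.top + 1 for u in units)
    margin = tolerance * typical_height
    return [u for u in units if u.bottom - sep > margin]
```

With five units whose bottom rows are 40, 40, 52, 40 and 41, only the unit reaching row 52 would be kept as a candidate. The later tasks (filtering false candidates, extracting the modifier, and applying dictionary-aided rules) depend on features the abstract does not describe, so they are not sketched here.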
    Description
    Includes bibliographical references (page 5).
    Department
    Center for Research on Bangla Language Processing (CRBLP), BRAC University
    Collections
    • Conference Papers (Centre for Research on Bangla Language Processing)

    Copyright © 2008-2019 Ayesha Abed Library, Brac University 