    Bangla character recognition for Android devices

    File
    12101113.pdf (2.330Mb)
    Date
    2015-12
    Publisher
    BRAC University
    Author
    Manzur, Shahrin
    Islam, Shafiqul
    Foysal, Abu
    Chowdhury, Aparajita
    URI
    http://hdl.handle.net/10361/4894
    Abstract
    In this paper, we describe our attempt to create editable documents from images by retrieving their text, a process widely known as Optical Character Recognition (OCR). We built an Android application for recognizing Bengali characters. Several earlier attempts have been made at developing a Bengali OCR system, but their limitations motivated us to work on this project. To recognize more characters and joint letters, we focused on reducing the error rate so that more of the text is preserved. For this purpose, we found the Tesseract OCR engine and the Leptonica image processing library to be the best options. Tesseract is used to recognize the characters, and Leptonica is used to build the Android application by extracting data from the text. We use Tesseract 3.03, the version currently available, for this project. We demonstrate how we obtained better results by using Tesseract together with Serak to create box files and trained data. We also discuss how we dealt with joint letters, dangerous ambiguity, and contrast issues in order to increase efficiency. Finally, we present our analyzed data, our progress, and the future scope for improvement.
    Keywords
    Optical Character Recognition (OCR); Tesseract; Bangla language; Android; Leptonica
     
    Description
    This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2015.
    Type
    Thesis
    Collections
    • Thesis & Report, BSc (Computer Science and Engineering)

    Copyright © 2008-2023 Ayesha Abed Library, Brac University 