
dc.contributor.advisor	Rahman, Md. Shahriar
dc.contributor.advisor	Rahman, Rafeed
dc.contributor.author	Shams, Khan Abrar
dc.contributor.author	Reaz, Md. Rafid
dc.contributor.author	Islam, Sanjida
dc.contributor.author	Rafi, Mohammad Ryan Ur
dc.date.accessioned	2024-06-25T06:48:54Z
dc.date.available	2024-06-25T06:48:54Z
dc.date.copyright	©2023
dc.date.issued	2023-09
dc.identifier.other	ID 19201052
dc.identifier.other	ID 19201044
dc.identifier.other	ID 20101615
dc.identifier.other	ID 22241152
dc.identifier.uri	http://hdl.handle.net/10361/23577
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2023.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 49-52).
dc.description.abstract	Sign language is the most common method of communication for people with disabling hearing loss. In Bangladesh, where Bangla Sign Language (BdSL) is the prominent sign language among people with hearing disabilities, communicating with the general population remains challenging, so a system that recognizes BdSL accurately and efficiently is in high demand. Deep learning architectures such as CNNs, ANNs, RNNs, and Axis Independent LSTMs can interpret Bangla Sign Language into readable digital text. Typically, an image-based sign language recognition system uses a camera that continuously sends images to a model, which then predicts the sign from those images. However, this setup introduces many sources of uncertainty, such as lighting conditions, noisy backgrounds, skin color, and hand orientation. To this end, we propose a procedure that reduces this uncertainty by considering three different modalities: spatial information, skeleton awareness, and edge awareness. We propose three image pre-processing techniques and integrate three convolutional neural network models. Finally, we tested nine ensemble meta-learning algorithms, five of which are modifications of averaging and voting techniques (a sketch of these two rules follows this record). Our proposed models achieved training accuracies of 99.77%, 98.11%, and 99.30%, higher than any other state-of-the-art image classification architecture except ResNet50 at 99.87%, and we achieved the highest accuracy of 95.13% on the test set. This research shows that considering multiple modalities can improve the system's overall performance.	en_US
dc.description.statementofresponsibility	Khan Abrar Shams
dc.description.statementofresponsibility	Md. Rafid Reaz
dc.description.statementofresponsibility	Sanjida Islam
dc.description.statementofresponsibility	Mohammad Ryan Ur Rafi
dc.format.extent	61 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Bangla sign language	en_US
dc.subject	Convolutional neural network	en_US
dc.subject	Ensemble method	en_US
dc.subject.lcsh	Neural networks (Computer science)
dc.subject.lcsh	Computer linguistics
dc.title	Tri-modal ensemble for enhanced Bangla sign language recognition	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B.Sc in Computer Science
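
The averaging and voting ensemble rules mentioned in the abstract can be illustrated with a small sketch. The following is a minimal Python/NumPy example, assuming three modality-specific classifiers (spatial, skeleton-aware, edge-aware) that each output per-class softmax probabilities; the class count, variable names, and tie-breaking rule are illustrative assumptions, not the thesis's exact implementation.

```python
# Minimal sketch (not the thesis's exact code) of two ensemble rules named in
# the abstract: probability averaging (soft voting) and majority (hard) voting
# over three modality-specific classifiers. All names and sizes are assumptions.
import numpy as np

NUM_CLASSES = 38  # assumed number of BdSL sign classes

def average_ensemble(prob_list):
    """Soft voting: average the per-class probabilities of all models."""
    return np.mean(np.stack(prob_list), axis=0).argmax(axis=-1)

def majority_vote_ensemble(prob_list):
    """Hard voting: each model votes for its top class; ties fall back to averaging."""
    votes = np.stack([p.argmax(axis=-1) for p in prob_list])   # shape: (models, samples)
    preds = []
    for i in range(votes.shape[1]):
        counts = np.bincount(votes[:, i], minlength=NUM_CLASSES)
        if counts.max() > 1:                                    # at least two models agree
            preds.append(int(counts.argmax()))
        else:                                                   # all models disagree
            preds.append(int(np.mean([p[i] for p in prob_list], axis=0).argmax()))
    return np.array(preds)

# Usage with random stand-ins for the spatial, skeleton, and edge models' outputs.
rng = np.random.default_rng(0)
spatial, skeleton, edge = (rng.dirichlet(np.ones(NUM_CLASSES), size=4) for _ in range(3))
print(average_ensemble([spatial, skeleton, edge]))        # soft-voting predictions
print(majority_vote_ensemble([spatial, skeleton, edge]))  # hard-voting predictions
```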

