dc.contributor.advisor: Uddin, Dr. Jia
dc.contributor.author: Kabir, Humayun
dc.contributor.author: Ahmed, Ruhan
dc.contributor.author: Nasib, Abdullah Umar
dc.date.accessioned: 2018-02-25T08:00:00Z
dc.date.available: 2018-02-25T08:00:00Z
dc.date.copyright: 2017
dc.date.issued: 2017-12
dc.identifier.other: ID 14141004
dc.identifier.other: ID 14101042
dc.identifier.other: ID 17341001
dc.identifier.uri: http://hdl.handle.net/10361/9546
dc.description: This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. [en_US]
dc.description: Cataloged from PDF version of thesis report.
dc.description: Includes bibliographical references (pages 35-37).
dc.description.abstract: This paper demonstrates the use of speech-to-text technology to convert natural, continuous spoken Bangla into Bengali Unicode text with good accuracy. We used Sphinx 4, an open-source framework created by Carnegie Mellon University (CMU) and written in Java, which provides the tools needed to develop an acoustic model for a custom language such as Bangla. It uses algorithms such as Baum-Welch to build the acoustic model from training data that we gathered ourselves. Our main objective was to ensure that the system was adequately trained on a word-by-word basis with various speakers so that it could recognize new speakers reliably. We used Audacity, a free digital audio workstation (DAW), to process the collected recordings with techniques such as continuous frequency profiling to improve the signal-to-noise ratio (SNR), vocal levelling, normalization, and syllable splitting and merging, to ensure an error-free 1:1 word mapping between each utterance and its corresponding transcription file text. The result is a speech-to-text recognition system with an acceptable accuracy of around 75%, trained on recorded speech from 10 individual speakers, both male and female, using custom transcript files that we wrote. [en_US]
dc.description.statementofresponsibility: Humayun Kabir
dc.description.statementofresponsibility: Ruhan Ahmed
dc.description.statementofresponsibility: Abdullah Umar Nasib
dc.format.extent: 37 pages
dc.language.iso: en [en_US]
dc.publisher: BRAC University [en_US]
dc.rights: BRAC University thesis reports are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject: Speech-to-text technology [en_US]
dc.subject: Bengali UNICODE [en_US]
dc.subject: Sphinx 4 [en_US]
dc.subject: Carnegie Mellon University [en_US]
dc.subject: Baum-Welch [en_US]
dc.subject: Signal-to-noise-ratio [en_US]
dc.title: Real time Bengali speech to text conversion using CMU Sphinx [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Department of Computer Science and Engineering, BRAC University
dc.description.degree: B. Computer Science and Engineering

