Show simple item record

DC Field	Value	Language
dc.contributor.advisor	Arif, Hossain
dc.contributor.author	Bristy, Israt Jerin
dc.contributor.author	Shakil, Nadim Imtiaz
dc.contributor.author	Musavee, Tesnim
dc.contributor.author	Choton, Akibur Rahman
dc.date.accessioned	2020-01-20T04:24:59Z
dc.date.available	2020-01-20T04:24:59Z
dc.date.copyright	2019
dc.date.issued	2019-08
dc.identifier.other	ID 15301006
dc.identifier.other	ID 15301037
dc.identifier.other	ID 15101110
dc.identifier.other	ID 15301102
dc.identifier.uri	http://hdl.handle.net/10361/13632
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2019.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 30-32).
dc.description.abstract	Speech is the most natural form of communication between people, while text and images are the most common forms of exchange in computer systems. Interest in converting between speech and text is therefore growing as a way to improve human-computer interaction. Understanding speech is not a challenge for a human, but it is difficult for a machine, which cannot pick up on expression or human nuance. To convert speech into text, the proposed model uses the open-source Sphinx 4 framework, which is written in Java. The proposed system involves three main steps: training an acoustic model, creating a language model, and building a dictionary with CMUSphinx. For training, audio files were recorded by 8 speakers, both male and female, to improve accuracy; 6 of them recorded each word 3 times. To test accuracy, we took recordings from 2 speakers, one of whom was unknown to the system. After testing, we obtained an accuracy of around 59.01%; for known speakers, accuracy was 78.57%. Audio files were given as input only to measure accuracy, since our main goal was a real-time system: the user speaks and the system converts the speech into text as it is spoken. (A minimal sketch of such a real-time setup follows this record.)	en_US
dc.description.statementofresponsibility	Israt Jerin Bristy
dc.description.statementofresponsibility	Nadim Imtiaz Shakil
dc.description.statementofresponsibility	Tesnim Musavee
dc.description.statementofresponsibility	Akibur Rahman Choton
dc.format.extent	32 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Bangla	en_US
dc.subject	Voice recognition	en_US
dc.subject	CMUSphinx	en_US
dc.subject	Acoustic model	en_US
dc.subject	Language model	en_US
dc.subject.lcsh	Machine learning.
dc.title	Bangla speech to text conversion using CMU sphinx	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B. Computer Science
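
The abstract describes the Sphinx 4 pipeline (acoustic model, language model, dictionary) but does not show how the trained Bangla resources are wired into the recognizer. The following is a minimal sketch of the real-time setup using the sphinx4 Configuration and LiveSpeechRecognizer classes; the three resource paths are hypothetical placeholders for the thesis's custom-trained Bangla models, not file names taken from the record.

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

public class BanglaLiveRecognizer {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();

        // Hypothetical paths: point these at the custom-trained Bangla
        // acoustic model, phonetic dictionary, and language model.
        configuration.setAcousticModelPath("bangla/acoustic-model");
        configuration.setDictionaryPath("bangla/bangla.dict");
        configuration.setLanguageModelPath("bangla/bangla.lm");

        // LiveSpeechRecognizer reads directly from the microphone,
        // matching the real-time use case described in the abstract.
        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
        recognizer.startRecognition(true); // true clears any previously cached audio

        SpeechResult result;
        while ((result = recognizer.getResult()) != null) {
            // Each result is one recognized utterance; print its best hypothesis.
            System.out.println("Recognized: " + result.getHypothesis());
        }

        recognizer.stopRecognition();
    }
}

For the accuracy tests mentioned in the abstract, sphinx4's StreamSpeechRecognizer can be substituted to decode the pre-recorded test audio from an InputStream instead of the microphone.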

