dc.contributor.advisor | Arif, Hossain | |
dc.contributor.author | Bristy, Israt Jerin | |
dc.contributor.author | Shakil, Nadim Imtiaz | |
dc.contributor.author | Musavee, Tesnim | |
dc.contributor.author | Choton, Akibur Rahman | |
dc.date.accessioned | 2020-01-20T04:24:59Z | |
dc.date.available | 2020-01-20T04:24:59Z | |
dc.date.copyright | 2019 | |
dc.date.issued | 2019-08 | |
dc.identifier.other | ID 15301006 | |
dc.identifier.other | ID 15301037 | |
dc.identifier.other | ID 15101110 | |
dc.identifier.other | ID 15301102 | |
dc.identifier.uri | http://hdl.handle.net/10361/13632 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2019. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 30-32). | |
dc.description.abstract | Speech is the most natural form of communication and interaction between people, while
text and images are the most common forms of exchange in computer systems.
Interest in converting between speech and text is therefore growing steadily as a way of
improving human-computer interaction. Understanding speech is not a challenge for a human,
but it is difficult for a machine, because a machine does not perceive
expression or human nuance. For the conversion of speech into text, the proposed model
uses the open-source framework Sphinx 4, which is written in Java.
Building the proposed system involves three main steps: training an acoustic model, creating
a language model, and building a pronunciation dictionary with CMUSphinx. For training, the audio
files were recorded by 8 speakers, both male and female, to improve accuracy; 6
of these speakers recorded each word 3 times. To test accuracy, we took audio recordings from
2 speakers, one of whom was unknown to the system. After testing, we obtained an overall
accuracy of around 59.01%, and 78.57% accuracy for known speakers. Audio files were used
as input only to measure accuracy, since our main purpose was to build a system that works in
real time: in our system, the user speaks and the speech is converted into text as it is spoken. | en_US |
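For context, the following is a minimal sketch of how the real-time recognition step described in the abstract could be wired up with the Sphinx 4 live-speech Java API. It is not taken from the thesis; the Bangla acoustic model, pronunciation dictionary, and language model paths are hypothetical placeholders standing in for the resources trained with CMUSphinx as described above.

```java
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

public class BanglaLiveTranscriber {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Hypothetical paths to a custom-trained Bangla acoustic model,
        // pronunciation dictionary, and language model built with CMUSphinx.
        config.setAcousticModelPath("file:models/bn-acoustic");
        config.setDictionaryPath("file:models/bn.dict");
        config.setLanguageModelPath("file:models/bn.lm");

        // LiveSpeechRecognizer captures audio from the microphone in real time.
        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(config);
        recognizer.startRecognition(true); // true discards previously cached audio

        SpeechResult result;
        while ((result = recognizer.getResult()) != null) {
            // Print the recognized text for each utterance as it is decoded.
            System.out.println(result.getHypothesis());
        }
        recognizer.stopRecognition();
    }
}
```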
dc.description.statementofresponsibility | Israt Jerin Bristy | |
dc.description.statementofresponsibility | Nadim Imtiaz Shakil | |
dc.description.statementofresponsibility | Tesnim Musavee | |
dc.description.statementofresponsibility | Akibur Rahman Choton | |
dc.format.extent | 32 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Bangla | en_US |
dc.subject | Voice recognition | en_US |
dc.subject | CMUSphinx | en_US |
dc.subject | Acoustic model | en_US |
dc.subject | Language model | en_US |
dc.subject.lcsh | Machine learning. | |
dc.title | Bangla speech to text conversion using CMU sphinx | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B. Computer Science | |