dc.contributor.advisor | Mostakim, Moin | |
dc.contributor.author | Rashid, Warida | |
dc.contributor.author | Reza, Mohi | |
dc.date.accessioned | 2018-01-15T05:23:18Z | |
dc.date.available | 2018-01-15T05:23:18Z | |
dc.date.copyright | 2017 | |
dc.date.issued | 2017 | |
dc.identifier.other | ID 14301026 | |
dc.identifier.other | ID 14101040 | |
dc.identifier.uri | http://hdl.handle.net/10361/9059 | |
dc.description | This thesis report is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017. | en_US |
dc.description | Cataloged from PDF version of thesis report. | |
dc.description | Includes bibliographical references (pages 31-33). | |
dc.description.abstract | We have created an isolated-word dataset, Prodorshok I, which consists of 34 Bengali words related to navigation, with 1011 voice samples. The word set is intended to help design speaker-dependent and speaker-independent, voice-command-driven automated speech recognition (ASR) systems that can potentially improve human-computer interaction. This paper presents the results of an objective analysis that was undertaken using a subset of words from Prodorshok I to help assess its reliability in ASR systems that utilize Hidden Markov Models (HMM) with Gaussian emissions and Deep Neural Networks (DNN). The results show that simple data augmentation involving a small pitch shift can make tangible improvements to speech recognition accuracy, even when working with small datasets. Prodorshok I will be expanded upon and made publicly available for others to use under the Open Database License (ODbL). | en_US |
dc.description.statementofresponsibility | Warida Rashid | |
dc.description.statementofresponsibility | Mohi Reza | |
dc.format.extent | 34 pages | |
dc.language.iso | en | en_US |
dc.publisher | BRAC University | en_US |
dc.rights | BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Data augmentation | en_US |
dc.subject | Speech recognition | en_US |
dc.title | Bengali isolated speech recognition : a comparative analysis of the effects of data augmentation on HMM and DNN based acoustic models | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, BRAC University | |
dc.description.degree | B. Computer Science and Engineering | |