dc.contributor.advisor | Huq, Aminul | |
dc.contributor.advisor | Rahman, Rafeed | |
dc.contributor.author | Hossain, Md. Sakib | |
dc.contributor.author | Islam, Syed Tamzidul | |
dc.contributor.author | Mazumder, Sujat | |
dc.contributor.author | Joy, Ali Imran | |
dc.contributor.author | Sakib, Md. Sadman | |
dc.date.accessioned | 2023-08-20T06:02:54Z | |
dc.date.available | 2023-08-20T06:02:54Z | |
dc.date.copyright | 2023 | |
dc.date.issued | 2023-03 | |
dc.identifier.other | ID 18101201 | |
dc.identifier.other | ID 22241133 | |
dc.identifier.other | ID 18101300 | |
dc.identifier.other | ID 18301179 | |
dc.identifier.other | ID 18301061 | |
dc.identifier.uri | http://hdl.handle.net/10361/19457 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 37-38). | |
dc.description.abstract | In our day-to-day lives we process a great many sounds: the brain absorbs sound
signals and turns them into useful information. Since human beings cannot extract every sound accurately,
a wide range of tools has been developed to pull essential information out of an audio source. Over the
years, many models based on various algorithms have been proposed to assist this extraction. Architectures
such as Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (R-CNN), Artificial
Neural Networks (ANN), VGG16, and ResNet50, along with numerous machine learning algorithms, have been used
to categorize audio effectively, and these methods have recently shown encouraging results in distinguishing
the spectrotemporal images of different sound classes. The purpose of this research was to analyze which
feature extraction method yields the best results with Convolutional Neural Networks (CNN), VGG16, and
ResNet50. In the proposed model, MFCC features are extracted from the dataset and used to train a multi-layer
convolutional neural network. In the experimental assessment, a sound dataset consisting of 105,829 audio
clips, divided into multiple groups of relevant sounds, was used to develop the models. Additionally, we
evaluated the models' validity, reaching an accuracy of 94.53% on the Speech Command dataset. | en_US |
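The abstract describes the pipeline only at a high level: MFCC features are computed from each clip and fed to a multi-layer CNN. The sketch below is a minimal illustration of that kind of pipeline, not the authors' code; it assumes Python with librosa and TensorFlow/Keras, and every parameter (16 kHz sampling, 40 MFCC coefficients, 100 frames, the layer sizes) is a hypothetical placeholder rather than a value reported in the thesis.

import numpy as np
import librosa
import tensorflow as tf

def extract_mfcc(path, sr=16000, n_mfcc=40, max_frames=100):
    """Load one clip and compute a fixed-size MFCC 'image' (n_mfcc x max_frames)."""
    audio, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along the time axis so every clip has the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :max_frames]
    return mfcc[..., np.newaxis]  # add a channel axis for the CNN

def build_cnn(num_classes, input_shape=(40, 100, 1)):
    """A small multi-layer CNN over MFCC feature maps."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

model = build_cnn(num_classes=35)  # Speech Commands v2 has 35 keyword classes
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])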
dc.description.statementofresponsibility | Md. Sakib Hossain | |
dc.description.statementofresponsibility | Syed Tamzidul Islam | |
dc.description.statementofresponsibility | Sujat Mazumder | |
dc.description.statementofresponsibility | Ali Imran Joy | |
dc.description.statementofresponsibility | Md. Sadman Sakib | |
dc.format.extent | 49 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Sound classification | en_US |
dc.subject | Spectrograms | en_US |
dc.subject | Speech command | en_US |
dc.subject | CNN | en_US |
dc.subject | ResNet50 | en_US |
dc.subject.lcsh | Neural networks (Computer science) | |
dc.title | Speech command classification based on deep neural networks | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B. Computer Science | |