Comprehensive analysis and development of deep learning models for Bengali character’s spectrogram image classification in child speech: introduction of spectro SETNet

Ahmed, Syed Istiaque; Hossain, Md. Jubayer; Hoque, Kayes Mohammad Bin; Tusher, Mahmadur Rahman; Islam, Sajedur

dc.contributor.advisor	Alam, Md. Golam Rabiul
dc.contributor.advisor	Nayla, Nishat
dc.contributor.author	Ahmed, Syed Istiaque
dc.contributor.author	Hossain, Md. Jubayer
dc.contributor.author	Hoque, Kayes Mohammad Bin
dc.contributor.author	Tusher, Mahmadur Rahman
dc.contributor.author	Islam, Sajedur
dc.date.accessioned	2024-09-09T05:00:39Z
dc.date.available	2024-09-09T05:00:39Z
dc.date.copyright	©2024
dc.date.issued	2024-05
dc.identifier.other	ID 20101273
dc.identifier.other	ID 20101470
dc.identifier.other	ID 20101471
dc.identifier.other	ID 20101005
dc.identifier.other	ID 23141093
dc.identifier.uri	http://hdl.handle.net/10361/24029
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2024.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 82-84).
dc.description.abstract	In rapidly developing linguistic technology, phoneme recognition plays a key role in language understanding and language learning. This research develops a recognition system for Bangla vowels, consonants, and numbers spoken by children aged three to six years. Combining advanced technological methods with classical phonetic education, we classify spectrogram images of Bengali children’s speech. Among modern machine learning (ML) techniques, image recognition and large language models (LLMs) are pervasive, and they have extended to the less explored domain of Bangla phoneme spectrogram image recognition. From our group of 21 participants, we generated a new, balanced dataset of 31,147 spectrogram images, created from scratch. The dataset was prepared meticulously to serve as a complete resource for researchers of Bangla-speaking children’s phoneme recognition. We then trained ten pre-existing deep learning models on our dataset and optimized their performance in Bangla phoneme recognition. The SENet model stood out among the existing models, with 96.89% accuracy on our testing dataset. The ResNet50 and VGG19 models ranked second and third, with accuracies of 88.8% and 87%, respectively. Based on these findings, we propose a novel architecture, Spectrogram SE-Transformer Block Network (Spectro SETNet), a hybrid of the ResNet50 model to which SE and Transformer blocks have been added, in order to cope with more complicated data and to limit the computational power required.
Our hypothesis is that the model not only improves the accuracy of Bengali speech recognition for children but also offers a new standard for more complex data processing with less computational power.	en_US
dc.format.extent	84 pages
dc.language.iso	en	en_US
dc.publisher	Brac University
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Automatic speech recognition	en_US
dc.subject	Character’s recognition	en_US
dc.subject	Deep learning	en_US
dc.subject	Mel-frequency spectrogram	en_US
dc.subject	Spectro-SETNet	en_US
dc.subject.lcsh	Automatic speech recognition--Data processing.
dc.subject.lcsh	Deep learning (Machine learning).
dc.subject.lcsh	Spectrometer--Data processing.
dc.title	Comprehensive analysis and development of deep learning models for Bengali character’s spectrogram image classification in child speech: introduction of spectro SETNet	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B.Sc in Computer Science

Files in this item

Name: 20101273, 20101470, 20101471, ...
Size: 1.614Mb
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1480]
