Gender classification in Bangla language using deep learning-based voice analysis
Date
2023
Publisher
Brac University
Author
Hakim, Talukder Juhaer
Monsur, Sayema Binte
Shuvo, Abtahi Maskawath
Azrine, Tasmia
Labib, Md. Zarif
Abstract
Gender classification based on voice analysis is one of the fundamental tasks in speech
and audio processing, with applications such as speech recognition systems,
voice assistants, and call center analytics. Gender recognition also plays a vital role
in speech synthesis, speaker identification, and human-computer interaction. Although
extensive research on this topic has been done in various languages, hardly any
studies can be found regarding gender classification in the Bangla language.
This research aims to recognize speaker gender in the Bangla language using deep
learning approaches and voice analysis. The core of our approach is the training of
CNN models (ResNet50, EfficientNetB0, InceptionV3, and DenseNet-121).
Mel-Frequency Cepstral Coefficients (MFCC) and short-time
Fourier transforms (STFT) were computed from audio recordings and used as input
features to the neural network models. The InceptionV3 and EfficientNetB0 models
achieved 95% accuracy with the MFCC input, which demonstrates the system's potential
for use in practical settings. By shedding light on the application of deep learning
techniques in the context of the Bangla language, this study advances the area of
gender identification.
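
The abstract names MFCC and STFT as the input features but does not give the extraction settings. The following is a minimal sketch of how such features are commonly computed, written with only the Python standard library so that each step is visible. The frame length, hop size, filter count, coefficient count, and all function names are illustrative assumptions, not the authors' configuration; in practice a library such as librosa (`librosa.stft`, `librosa.feature.mfcc`) would normally be used instead.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude STFT: one row per frame, frame_len // 2 + 1 bins per row.

    Uses a naive DFT for clarity; frames are not padded, so trailing
    samples that do not fill a whole frame are dropped.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    spec = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [x * w for x, w in zip(signal[start:start + frame_len], window)]
        row = []
        for k in range(n_bins):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        spec.append(row)
    return spec


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, frame_len, sample_rate):
    """Triangular filters spaced evenly on the mel scale up to Nyquist."""
    n_bins = frame_len // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge frequencies: each filter spans three consecutive edges.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_freqs = [k * sample_rate / frame_len for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_freqs:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=8, n_coeffs=5):
    """MFCCs per frame: power spectrum -> mel energies -> log -> DCT-II."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for mags in stft_magnitude(signal, frame_len, hop):
        power = [m * m for m in mags]
        # Small constant avoids log(0) for empty or silent bands.
        log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                   for row in bank]
        # Unnormalized DCT-II over the log mel energies.
        coeffs = [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                      for j, e in enumerate(log_mel))
                  for c in range(n_coeffs)]
        feats.append(coeffs)
    return feats
```

Calling `stft_magnitude(samples)` or `mfcc(samples, sample_rate)` on a list of audio samples returns a frames-by-features matrix. To feed image-style CNNs such as those listed above, a matrix like this is typically rendered or resized as a 2-D input; how the study did this is not stated in the abstract.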