Personal information from Bangla speech signal using MFCC and GMM

Hridy, Maisha Munawara; Hasan, Md. Hasib; Emon, Mahfuz Al

Date

2019-08

Publisher

Brac University

Abstract

Our system extracts personal information from Bangla speech. The dataset consists of real-life voice inputs from different age and gender groups, drawn from a set of Bengali speech samples collected from YouTube. The system is based on basic machine learning algorithms. Mel-frequency cepstral coefficients (MFCCs) were used as the features to train and construct it. For gender and age detection, a Gaussian mixture model (GMM) calculates the final scores from the MFCCs of the extracted speech samples. A GMM groups the data into subsets of the whole set based on probability. Alongside gender determination, age detection is also carried out using the fundamental frequency of speech. The system is implemented in Python. It achieved 88% accuracy for gender recognition and 75% accuracy for age detection.
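
The GMM scoring step described in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per class and picks the class whose model gives the highest average per-frame log-likelihood. The model parameters, the two-dimensional feature vectors, and the function names below are hypothetical and stand in for trained models and real MFCC frames; the thesis's actual configuration is not stated in this record.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + diag_gaussian_logpdf(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def gmm_score(frames, gmm):
    """Average per-frame log-likelihood of an utterance (a list of frames)."""
    return sum(gmm_frame_loglik(f, gmm) for f in frames) / len(frames)

def classify(frames, models):
    """Return the label whose GMM assigns the highest score to the utterance."""
    return max(models, key=lambda label: gmm_score(frames, models[label]))

# Hypothetical toy models: two components each, 2-dimensional features.
# Real models would be trained on MFCC vectors (typically 13 or more dimensions).
MODELS = {
    "male": [(0.6, [-1.0, 0.0], [1.0, 1.0]), (0.4, [-2.0, 1.0], [0.5, 0.5])],
    "female": [(0.5, [1.0, 0.0], [1.0, 1.0]), (0.5, [2.0, -1.0], [0.5, 0.5])],
}

# Hypothetical feature frames standing in for the MFCCs of one utterance.
frames = [[1.2, -0.3], [1.8, -0.9], [0.9, 0.1]]
print(classify(frames, MODELS))
```

Averaging the log-likelihood over frames keeps scores comparable between utterances of different lengths. The same decision rule would extend to age groups by adding one model per group.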

Keywords

Mel Frequency Cepstral Coefficient; Gaussian Mixture Model; Natural language processing; Python; Bangla

LC Subject Headings

Machine learning.

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2019.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 18-20).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1480]