A voice-signal-based gender prediction model using a random forest classifier
Abstract
In the proposed model, a Classification and Regression Tree (CART) was used as a classifier to predict
gender alongside four different algorithms, which were tested with varying dataset frames, layer sizes
and sample counts to find the best configuration for our model. We tuned our dataset with Principal
Component Analysis (PCA), which slightly improved the accuracy and worked well with the algorithms. The
idea of voiceprints and human-computer interaction motivated us to predict gender using the different
classifiers in our model. In addition, the overall efficiency and outcomes of human-computer
interaction inspired us to select this model for our thesis paper.
In the existing system, several problems arose while working with our proposed model: overfitting to
the dataset, choosing among different layer sizes, choosing the number of decision trees and, most
importantly, determining the hidden layer sizes. We successfully solved most of these problems by
running five different algorithms on our model: Decision Tree Classifier, Logistic Regression, Support
Vector Machine (SVM), Multi-Layer Perceptron (MLP) Classifier and Random Forest (RF) Classifier.
For each algorithm, we split the total dataset into 75% for training and 25% for testing. Because of
the different layer sizes, each algorithm produced a different accuracy. The worst accuracy, 75% in two
implementations, was given by the Multi-Layer Perceptron (MLP), and the best accuracy, 97.34%, was
given by the Random Forest Classifier in our proposed model.
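
As an illustration of the 75%/25% hold-out evaluation described above, the following is a minimal
sketch in plain Python. The synthetic one-feature data, the helper names (train_test_split, accuracy,
fit_threshold) and the simple threshold classifier are assumptions made for this example only. They are
not the dataset, the features or the classifiers used in this work.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and return (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, test):
    """Fraction of test samples whose predicted label equals the true label."""
    correct = sum(1 for feature, label in test if predict(feature) == label)
    return correct / len(test)


def fit_threshold(train):
    """Hypothetical stand-in classifier: cut midway between the two class means."""
    means = {}
    for label in ("male", "female"):
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    cut = (means["male"] + means["female"]) / 2
    low, high = sorted(means, key=means.get)  # label with the lower mean first
    return lambda x: low if x < cut else high


# Synthetic data (assumption): one made-up feature, two well-separated classes.
rng = random.Random(1)
data = [(rng.gauss(0.0, 0.1), "male") for _ in range(100)]
data += [(rng.gauss(1.0, 0.1), "female") for _ in range(100)]

train, test = train_test_split(data, test_fraction=0.25)
model = fit_threshold(train)
acc = accuracy(model, test)
print(f"train={len(train)} test={len(test)} accuracy={acc:.4f}")
```

The classifier is fitted on the training portion only and scored on the held-out portion, so the
reported accuracy reflects samples the model has not seen. Any of the five algorithms named above could
take the place of fit_threshold in the same procedure.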