Multiclass emotion classification using spectrogram image analysis: A CNN-XGBoost fusion approach
Abstract
AI-based emotion recognition, also known as affective computing, is a burgeoning branch of artificial intelligence that helps computers become more perceptive by scrutinizing humans' non-verbal signals and sentiments. Emotion recognition is an essential component of human-computer interaction and has attracted considerable interest recently because of its potential applications in a variety of fields, including psychology, business, neuromarketing, education, and entertainment. In this research, we propose a
fusion-based model that combines Convolutional Neural Networks (CNNs) and the XGBoost algorithm on electroencephalogram (EEG) spectrogram images to identify four emotion classes: happy, sad, fear, and neutral. Prior research has shown that EEG signals hold important information about emotional states, and spectrogram images offer an effective way to visualize this information. Before feeding the spectrogram images into the CNN-XGBoost model, we first transform the EEG data into RGB images using a Short-Time Fourier Transform (STFT). The CNN extracts pertinent features from the spectrogram images, while XGBoost performs the multiclass classification. On our
benchmark, the publicly accessible SEED-IV dataset for EEG-based emotion identification, the proposed approach achieved state-of-the-art precision and F1-score results. For comparison, we also extracted features from the signals using a range of feature extraction approaches, including the Short-Time Fourier Transform, Discrete Cosine Transform, Power Spectral Density, Differential Entropy, and certain statistical traits, and evaluated our model against several well-known models to demonstrate its superior accuracy and computational efficiency. The performance analysis demonstrates that the proposed CNN-XGBoost
fusion approach based on spectrogram images outperforms conventional feature-based CNN and LSTM models as well as pretrained models such as VGG16 and VGG19. Our CNN-XGBoost fusion framework using EEG spectrogram images thus offers a promising method for precise and efficient multiclass emotion identification, with significant implications for the development of future human-computer interaction systems.
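As a concrete illustration of the preprocessing step described above, the following sketch converts one raw EEG channel into an RGB spectrogram image via the STFT. The sampling rate, window length, and the grayscale-to-RGB mapping are illustrative assumptions, not the authors' exact settings.

```python
# Hypothetical sketch of the STFT spectrogram step (not the authors' code):
# turn a raw EEG channel into an H x W x 3 spectrogram image.
import numpy as np
from scipy import signal

fs = 200                                  # assumed sampling rate in Hz
eeg = np.random.randn(fs * 4)             # 4 s of one channel (placeholder data)

# Short-Time Fourier Transform, then log-power spectrogram
f, t, Zxx = signal.stft(eeg, fs=fs, nperseg=128, noverlap=64)
power_db = 20 * np.log10(np.abs(Zxx) + 1e-10)

# Normalize to [0, 1]; a colormap (e.g. matplotlib's viridis) would map this
# to colour channels, here we simply replicate it into three channels.
norm = (power_db - power_db.min()) / (power_db.max() - power_db.min())
rgb = np.stack([norm] * 3, axis=-1)       # frequency bins x time frames x 3

print(rgb.shape)
```

In practice, each resulting image would be resized to the CNN's input resolution, the CNN's penultimate-layer activations would serve as the feature vector, and XGBoost would classify those vectors into the four emotion classes.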