A multimodal approach to sentiment analysis for predicting customer emotions towards a product
Sentiment analysis, the science of understanding human emotions, took shape as a research field roughly a decade ago, though its roots can be traced back to the middle of the 19th century. Its goal is to extract and predict human emotions from facial expressions, speech, or, in some cases, text. Inspired by existing work, we propose a multimodal model that uses both facial cues and speech to forecast customers' sentiment about, and satisfaction with, a given product. Our model helps companies gain key insights into specific market regions and their customers, and thereby a competitive advantage over others. In this study, we estimate how a demographic perceives a product based on emotions extracted from customers' facial expressions and speech. Although much research has been conducted in the field, few systems are multifaceted and integrated, with components that rely on one another to produce a single result. We capture people's emotions by recording their facial cues and speech patterns as they interact with a specific product, in our case a mobile phone. Facial expressions are analyzed with AWS Rekognition. For the textual part, sentiment is analyzed with a recurrent neural network (RNN) built as a Keras Sequential model on TensorFlow. Finally, we merge the emotions obtained from the video with the textual sentiment to form the features for our predictive model, which is trained with the XGBoost algorithm. Using 3-fold cross-validation repeated over 5 iterations, we achieve an average accuracy of approximately 81 percent with a standard deviation of 0.065.
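The fusion step described above, combining per-face emotion confidences with a textual sentiment score into one feature vector, can be sketched as follows. This is a minimal illustration, not the thesis code: the emotion labels mirror the style of AWS Rekognition's DetectFaces response, and the function name and the assumption of a single scalar sentiment score from the RNN are hypothetical.

```python
# Illustrative emotion categories, in the style of AWS Rekognition's
# DetectFaces "Emotions" field (labels and ordering are an assumption).
EMOTIONS = ["HAPPY", "SAD", "ANGRY", "SURPRISED", "CALM"]

def fuse_features(face_emotions, text_sentiment):
    """Merge facial-emotion confidences (dict of label -> confidence)
    with a scalar text-sentiment score into a single feature vector,
    which would then feed a downstream classifier such as XGBoost."""
    # Missing emotions default to 0.0 so every vector has the same length.
    vec = [face_emotions.get(label, 0.0) for label in EMOTIONS]
    vec.append(text_sentiment)  # last feature: RNN sentiment score
    return vec

# Example: a mostly happy face paired with a positive text sentiment.
sample = fuse_features({"HAPPY": 0.92, "CALM": 0.05}, 0.81)
print(sample)  # [0.92, 0.0, 0.0, 0.0, 0.05, 0.81]
```

Keeping a fixed label order and zero-filling absent emotions ensures every customer record maps to a feature vector of identical shape, which is what a gradient-boosted tree model like XGBoost expects.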