Early threat warning via speech and emotion recognition from voice calls

Ishtiak, Ifaz; Rahman, Mohammad Mazedur; Usmani, Md.Razaul Haque

dc.contributor.advisor	Arif, Hossain
dc.contributor.author	Ishtiak, Ifaz
dc.contributor.author	Rahman, Mohammad Mazedur
dc.contributor.author	Usmani, Md.Razaul Haque
dc.date.accessioned	2019-02-14T05:37:49Z
dc.date.available	2019-02-14T05:37:49Z
dc.date.copyright	2018
dc.date.issued	2018-12
dc.identifier.other	ID 15101118
dc.identifier.other	ID 15101043
dc.identifier.other	ID 14241005
dc.identifier.uri	http://hdl.handle.net/10361/11412
dc.description	This thesis is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2018.	en_US
dc.description	Includes bibliographical references (pages 53-56).
dc.description	Cataloged from PDF version of thesis.
dc.description.abstract	The aim of this system is to identify potential threats and provide an early warning or alert for such cases. Detection is based on voice, such as voice chat over telecommunication networks or social media. The intended result is achieved in three major steps. First, speech is converted to text, from both real-time audio recordings and accent groups, primarily using IBM Watson’s Speech to Text. Second, the resulting text is used to identify possible trigger words or word patterns from a classified selection of threat-related and negative words. Finally, the same audio source is used to detect emotions from frequency shifts: vocal features are extracted from the audio input and processed with multiple classifier algorithms such as Support Vector Machines (SVMs), Random Forests and Naïve Bayes. Libraries such as LibROSA are applied to extract primary audio features such as Mel Frequency Cepstral Coefficients (MFCC) to generate accurate predictions. The system achieves an emotion detection accuracy of approximately 84% using an SVM with the RBF (Radial Basis Function) kernel. Keywords: Emotion Recognition; Support Vector Machines; Speech to Text; Random Forest; Feature Extraction; MFCC	en_US
dc.description.statementofresponsibility	Ifaz Ishtiak
dc.description.statementofresponsibility	Mohammad Mazedur Rahman
dc.description.statementofresponsibility	Md.Razaul Haque Usmani
dc.format.extent	63 pages
dc.language.iso	en	en_US
dc.publisher	BRAC University	en_US
dc.rights	BRAC University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Emotion recognition	en_US
dc.subject	Support vector machines	en_US
dc.subject	Speech to Text	en_US
dc.subject	Random forest	en_US
dc.subject	Feature extraction	en_US
dc.subject	MFCC	en_US
dc.subject.lcsh	Human-computer interaction.
dc.subject.lcsh	Artificial intelligence.
dc.subject.lcsh	Emotions -- Computer simulation.
dc.title	Early threat warning via speech and emotion recognition from voice calls	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, BRAC University
dc.description.degree	B. Computer Science and Engineering
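
Illustrative sketch (not part of the catalog record): the second step described in the abstract, scanning transcribed text for trigger words or word patterns, might look roughly like the Python below. The word list, phrase list, emotion labels and the rule that combines a text match with a detected emotion are all hypothetical placeholders; this record does not give the thesis's actual word selection or alert logic.

```python
import re

# Hypothetical examples only; the thesis's classified word selection is not listed here.
TRIGGER_WORDS = {"kill", "bomb", "attack", "hurt"}
TRIGGER_PHRASES = ("blow up", "watch your back")
# Assumed emotion labels that a separate audio classifier might output.
NEGATIVE_EMOTIONS = {"angry", "fearful"}


def find_triggers(transcript):
    """Return trigger words and phrases found in a transcript (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    normalized = " ".join(tokens)
    # Whole-token comparison, so "skill" does not match "kill".
    hits = [word for word in sorted(TRIGGER_WORDS) if word in tokens]
    # Phrases are matched on word boundaries in the normalized text.
    hits += [
        phrase
        for phrase in TRIGGER_PHRASES
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized)
    ]
    return hits


def should_alert(transcript, emotion):
    """Assumed rule: alert only when a trigger is present and the emotion is negative."""
    return bool(find_triggers(transcript)) and emotion in NEGATIVE_EMOTIONS
```

In this sketch the transcript would come from a speech-to-text service and the emotion label from a classifier trained on audio features; both are passed in as plain strings so the matching logic can be tried in isolation.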

Files in this item

Name: 15101118,14241005,15101043_CSE.pdf
Size: 18.63 MB
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1589]