Urban sound classification using convolutional Neural Network and long short term memory based on multiple features

Das, Joy Krishan; Ghosh, Arka; Pal, Abhijit Kumar; Dutta, Sumit

dc.contributor.advisor	Chakraborty, Amitabha
dc.contributor.author	Das, Joy Krishan
dc.contributor.author	Ghosh, Arka
dc.contributor.author	Pal, Abhijit Kumar
dc.contributor.author	Dutta, Sumit
dc.date.accessioned	2021-05-29T10:04:59Z
dc.date.available	2021-05-29T10:04:59Z
dc.date.copyright	2020
dc.date.issued	2020-04
dc.identifier.other	ID 17301218
dc.identifier.other	ID 16201007
dc.identifier.other	ID 16301148
dc.identifier.other	ID 16301104
dc.identifier.uri	http://dspace.bracu.ac.bd/xmlui/handle/10361/14444
dc.description	This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2020.	en_US
dc.description	Cataloged from PDF version of thesis.
dc.description	Includes bibliographical references (pages 43-46).
dc.description.abstract	There are many sounds all around us, and our brain can easily and clearly identify them. Furthermore, our brain processes the received sound signals continuously and provides us with relevant environmental knowledge. Although they do not reach the accuracy of the brain, some smart devices can extract the necessary information from an audio signal with the help of different algorithms. As time passes, more and more research is being conducted to increase the accuracy of this information extraction. Over the years, several models such as CNN, ANN, and RCNN, along with many machine learning techniques, have been adopted to classify sound accurately, and in recent years these have shown promising results in distinguishing spectro-temporal pictures. For our research, we use seven features: Chromagram, Mel-spectrogram, Spectral contrast, Tonnetz, MFCC, Chroma CENS, and Chroma cqt. We have employed two models for classifying audio signals, LSTM and CNN, and the dataset used for the research is UrbanSound8K. The novelty of the research lies in showing that the LSTM achieves better classification accuracy than the CNN when the MFCC feature is used. Furthermore, we have augmented the UrbanSound8K dataset to ensure that the accuracy of the LSTM is higher than that of the CNN on both the original dataset and the augmented one. Moreover, we have tested the accuracy of the models based on the features used, by using each feature separately on each model, in addition to the two forms of feature stacking that we have performed. The first form of feature stacking contains the features Chromagram, Mel-spectrogram, Spectral contrast, Tonnetz, and MFCC, while the second form contains MFCC, Mel-spectrogram, Chroma cqt, and Chroma stft.
Likewise, we have stacked features using different combinations to expand our research. In this way it was possible, with our LSTM model, to reach an accuracy of 98.80%, which is state-of-the-art performance.	en_US
dc.description.statementofresponsibility	Joy Krishan Das
dc.description.statementofresponsibility	Arka Ghosh
dc.description.statementofresponsibility	Abhijit Kumar Pal
dc.description.statementofresponsibility	Sumit Dutta
dc.format.extent	47 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Sound classification	en_US
dc.subject	Spectrograms	en_US
dc.subject	Urbansound8k	en_US
dc.subject	CNN	en_US
dc.subject	LSTM	en_US
dc.subject	LibROSA	en_US
dc.subject.lcsh	Neural networks (Computer science)
dc.title	Urban sound classification using convolutional Neural Network and long short term memory based on multiple features	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Computer Science and Engineering, Brac University
dc.description.degree	B. Computer Science

Files in this item

Name: 17301218, 16201007, 16301148, ...
Size: 3.561Mb
Format: PDF

This item appears in the following Collection(s)

Thesis & Report, BSc (Computer Science and Engineering) [1486]