RansomListener: Ransom call sound investigation using LSTM and CNN Architectures
Abstract
Ransom calls are a common phenomenon in kidnapping and abduction-related
incidents, where the life of the victim remains extremely vulnerable. These
phone calls are often analyzed in real time by law enforcement authorities to quickly
identify the suspects and obtain crucial information for prompt action. However, manual
analysis of these calls is often difficult because of poor sound quality and
the presence of multiple background noises. Even with high-end software at
their disposal, investigators struggle to clean up incoming calls accurately, because
separating the different layers of noise in a call takes a great deal of time. This paper
proposes a model based on deep convolutional neural networks and signal processing
for the automatic classification of crucial sounds in ransom-related phone calls. We
propose customized LSTM and 2D CNN models and compare their outputs
with VGG16 and AlexNet. This paper also presents a unique dataset of
17650 audio clips collected from verified online sources. The dataset covers
different voices, such as male or female, and environmental sounds from places
where the victim might be, which can serve as probable clues for an investigation.
Finally, the models achieved very high classification accuracy, with the LSTM
reaching around 93.4%.