Explainable Deepfake Video Detection Using Convolutional Neural Network and CapsuleNet
Abstract
The term ‘deepfake’ derives from deep learning technology. Deepfake technology can
seamlessly stitch anyone into digital media in which they never actually took part. Its key
components are machine learning and artificial intelligence (AI). Deepfakes were initially
introduced for research, industrial, and entertainment purposes. The underlying capabilities
have existed for decades, but early creations were not nearly as realistic as today’s. As
deepfakes continue to improve, they produce content that is hard to identify as ‘real’ or
‘fake’ with the naked eye. Moreover, new tools now allow anyone, even unskilled creators,
to make deepfakes. The ease of access and growing availability of deepfake creation tools
have raised serious security concerns. The most widely used algorithm for producing
deepfake videos is the generative adversarial network (GAN), a machine learning framework
in which a generator creates fake images and a discriminator tries to tell them apart from
real ones; the two networks compete until the generator produces the most convincing
possible fake frames or videos. Our primary goal is to use a convolutional neural network
(CNN) and a CapsuleNet with LSTM to distinguish the frames of a video that were generated
by a deepfake algorithm from the original ones. We also want to determine why our model
predicts a given detection result and to analyze the underlying patterns using explainable
AI. We apply this approach to develop a transparent relationship between AI and human
agents and to set an applicable example of explainable AI in real-life scenarios.
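
As a minimal sketch of the frame-level detection pipeline described above, assuming PyTorch: a small CNN extracts per-frame features, an LSTM models temporal patterns across frames, and a sigmoid head scores each frame as real or fake. All module names, layer sizes, and hyperparameters here are illustrative assumptions, not the final architecture; the CapsuleNet branch and the explainable-AI analysis are omitted for brevity.

    import torch
    import torch.nn as nn

    class FrameFeatureExtractor(nn.Module):
        """Small CNN that maps each video frame to a feature vector."""
        def __init__(self, feat_dim=128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc = nn.Linear(64, feat_dim)

        def forward(self, x):            # x: (B, 3, H, W)
            h = self.conv(x).flatten(1)  # (B, 64)
            return self.fc(h)            # (B, feat_dim)

    class DeepfakeDetector(nn.Module):
        """Per-frame CNN features -> LSTM over time -> per-frame fake score."""
        def __init__(self, feat_dim=128, hidden=64):
            super().__init__()
            self.cnn = FrameFeatureExtractor(feat_dim)
            self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, video):                 # video: (B, T, 3, H, W)
            B, T = video.shape[:2]
            feats = self.cnn(video.flatten(0, 1)) # (B*T, feat_dim)
            feats = feats.view(B, T, -1)          # (B, T, feat_dim)
            out, _ = self.lstm(feats)             # (B, T, hidden)
            return torch.sigmoid(self.head(out))  # (B, T, 1): fake prob per frame

    # Usage: score a batch of 2 clips, 8 frames each, 64x64 pixels.
    model = DeepfakeDetector()
    clips = torch.randn(2, 8, 3, 64, 64)
    frame_scores = model(clips)                   # shape (2, 8, 1)

Scoring every frame rather than the whole clip matches the stated goal of locating which frames of a video were generated by the deepfake algorithm and which are original.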