Recognition of Bangladeshi Sign Language from 2D Videos Using OpenPose and an LSTM-Based RNN
Abstract
Sign-language recognition is an essential computer-vision task for overcoming the communication barrier between the deaf-mute community and the general population. Bangladeshi Sign Language (BdSL) is the medium of communication of the deaf-mute community of Bangladesh. Around 2.4 million people in Bangladesh cannot communicate without a sign language, yet developing countries like Bangladesh do not have sufficient facilities for them [34]. Our research presents a Bangladeshi sign-language recognizer, an approach to understanding BdSL that can serve as a bridge between the deaf-mute community and the hearing world. Although much work has been done in this field for other languages, there are only a few notable works on Bangladeshi Sign Language; these rely on techniques that are not accessible to everyone, and their accuracy is also not satisfactory. Moreover, there is a shortage of publicly available BdSL datasets. Our objective is to deliver a compact and highly accurate system that recognizes Bangladeshi Sign Language. We propose a method based on estimation of human keypoints. First, we develop a BdSL dataset containing 1,151 videos of ten different words. Our algorithm uses OpenPose to extract human pose keypoints from 2D videos and feeds the extracted features, preserving their temporal order, to an LSTM-based RNN classifier that accurately classifies the signs. Our proposed model classifies Bangladeshi Sign Language signs with 96.54% accuracy.
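To make the described pipeline concrete, the following is a minimal sketch (not the authors' exact architecture) of an LSTM-based RNN classifier operating on per-frame OpenPose keypoint sequences. The sequence length (60 frames), the per-frame feature layout (25 body plus 2 × 21 hand keypoints, x and y only), and the layer sizes are illustrative assumptions, not values taken from the paper.

```python
# Sketch of an LSTM-based classifier over OpenPose keypoint sequences.
# Assumes keypoints were already extracted per frame and padded/truncated
# to a fixed sequence length; all sizes below are illustrative assumptions.
import numpy as np
from tensorflow.keras import layers, models

NUM_CLASSES = 10                      # ten BdSL words in the dataset
SEQ_LEN = 60                          # assumed frames per video after padding
FEATURES_PER_FRAME = (25 + 2 * 21) * 2  # assumed body + hand keypoints, (x, y)

def build_model():
    model = models.Sequential([
        # Zero-padded frames are masked so they do not affect the recurrence.
        layers.Masking(mask_value=0.0,
                       input_shape=(SEQ_LEN, FEATURES_PER_FRAME)),
        layers.LSTM(128, return_sequences=True),  # model temporal dynamics
        layers.LSTM(64),                          # summarize the sequence
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    # Random placeholder data standing in for real keypoint sequences.
    x = np.random.rand(8, SEQ_LEN, FEATURES_PER_FRAME).astype("float32")
    y = np.random.randint(0, NUM_CLASSES, size=(8,))
    model = build_model()
    model.fit(x, y, epochs=1, batch_size=4)
```

In practice, the input tensors would be built by running OpenPose on each video, flattening the detected keypoints frame by frame, and padding shorter videos with zeros so the masking layer can ignore the filler frames.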