Automatic subtitle generation for Bengali multimedia using deep learning

Rhythm, Ehsanur Rahman; Arnob, Shafakat Sowroar; Shuvo, Rajvir Ahmed

View/Open

22241163_20101129_20141003_CSE.pdf (3.419Mb)

Date

2023-09

Publisher

Brac University

Abstract

For audio or video material to be more inclusive and accessible, automatic subtitle generation is essential. Nevertheless, implementing this technology into Bengali presents significant challenges due to scarce resources and linguistic difficulty. In this study, a new deep learning based system for creating Subtitles for Bengali multimedia automatically is introduced. The suggested approach makes use of the Wav2vec2 and the Common Voice Bengali Dataset, a large collection of Bengali audio recordings. This study uses the Common Voice Dataset Bengali to train and tune the Wav2vec2 model in order to accurately convert Bengali audio into text. Current automatic speech recognition approaches are combined with Bengali language-specific factors in the created system to give accurate and reliable transcription works. The transcribed text is synced with the matching audio parts throughout the subtitle production process. The produced subtitles are enhanced using post-processing approaches, similar to capitalization and punctuation restoration, to ensure readability and consistency. The findings of this study might greatly improve Bengali language media’s usability and availability across a range of sectors. The created subtitles may enhance the watching experience for Bengali multimedia by easing greater understanding, and expanding availability. The study demonstrates the potential of using deep learning and ASR methods to get over the difficulties of automated subtitle production in the Bengali language, advancing multimedia availability and inclusion.

Keywords

Automatic subtitle generation; Bengali audio; Deep learning; Natural language processing

LC Subject Headings

Natural language processing (Computer science); Computational linguistics; Data mining

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2023.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 51-53).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1583]