Text normalization system for Bangla
Abstract
This paper describes a text normalization system for the Bangla language (exonym: Bengali), built by identifying the
semiotic classes in a Bangla text corpus. After the semiotic classes were identified, a set of rules was written for tokenization
and verbalization. This work is relevant to Text-To-Speech (TTS) systems as well as to language models for speech recognition.
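
To make the two rule stages concrete, the following is a minimal sketch of the idea and not the system described in this paper. The class names (`DECIMAL`, `DIGITS`, `WORD`), the regular expressions, and the helper functions are all hypothetical. The verbalizer also makes a strong simplifying assumption: it reads numbers digit by digit rather than as full Bangla cardinals.

```python
import re

# Hypothetical semiotic classes, tried in order. A real class inventory
# would be derived from corpus analysis and would be much larger.
BANGLA_DIGITS = "০১২৩৪৫৬৭৮৯"
DIGIT_WORDS = ["শূন্য", "এক", "দুই", "তিন", "চার", "পাঁচ", "ছয়", "সাত", "আট", "নয়"]

TOKEN_PATTERNS = [
    ("DECIMAL", re.compile(r"[০-৯]+\.[০-৯]+")),  # e.g. ৩.৫
    ("DIGITS", re.compile(r"[০-৯]+")),           # e.g. ১২
    ("WORD", re.compile(r"[^\s০-৯]+")),          # anything else, up to whitespace or a digit
]


def tokenize(text):
    """Split text into (semiotic_class, token) pairs using the patterns above."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        # Every non-space character starts a DIGITS or WORD match,
        # so this loop always advances.
        for label, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((label, match.group()))
                pos = match.end()
                break
    return tokens


def read_digits(digits):
    """Spell out a string of Bangla digits one digit at a time (a simplification)."""
    return " ".join(DIGIT_WORDS[BANGLA_DIGITS.index(d)] for d in digits)


def verbalize(label, token):
    """Apply the verbalization rule for the token's semiotic class."""
    if label == "DIGITS":
        return read_digits(token)
    if label == "DECIMAL":
        whole, fraction = token.split(".")
        return f"{read_digits(whole)} দশমিক {read_digits(fraction)}"
    return token  # ordinary words pass through unchanged


def normalize(text):
    """Tokenize, then verbalize each token, and join the results with spaces."""
    return " ".join(verbalize(label, token) for label, token in tokenize(text))


print(normalize("১২ জন"))  # → এক দুই জন
```

Keeping classification (tokenization) separate from the per-class verbalization rules means a new semiotic class can be added as one pattern and one rule, without touching the rest of the pipeline.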