Conference Papers (Centre for Research on Bangla Language Processing)
http://hdl.handle.net/10361/102
Feed updated: 2024-03-29T08:17:36Z

Detecting flames and insults in text
http://hdl.handle.net/10361/714
Issued: 2008-12-01; record updated: 2019-09-29T05:27:33Z
Mahmud, Altaf; Ahmed, Kazi Zubair; Khan, Mumit
While the internet has become the leading source of information, it has also become a medium for flames, insults and other forms of abusive language, which add nothing to the quality of the information available. A human reader can easily distinguish between what is information and what is a flame or any other form of abuse. It is, however, much more difficult for a language processor to do this automatically. This paper describes a new approach for an automated system to distinguish between information and personal attacks containing insulting or abusive expressions in a given document. In linguistics, insulting or abusive messages are viewed as an extreme subset of subjective language. We create a set of rules that extract the semantic information of a given sentence from its general semantic structure in order to separate information from abusive language.
Includes bibliographical references (page 10).

Text to speech for Bangla language using festival
http://hdl.handle.net/10361/675
Issued: 2007-01-01; record updated: 2019-09-29T05:27:47Z
Alam, Firoj; Nath, Promila Kanti; Khan, Mumit
In this paper, we present a Text to Speech (TTS) synthesis system for the Bangla language using the open-source Festival TTS engine. Festival is a complete TTS synthesis system, with components supporting front-end processing of the input text, language modeling, and speech synthesis using its signal processing module. The Bangla TTS system proposed here creates the voice data for Festival and additionally extends Festival, using its embedded Scheme scripting interface, to incorporate Bangla language support. Festival is a concatenative TTS system using diphone or other unit selection speech units. Our TTS implementation uses two different kinds of the concatenative methods supported in Festival: unit selection and multisyn unit selection. The function of a Text-to-Speech system is to convert text in some language into its spoken equivalent through a series of modules. The modules that constitute the TTS system are described in detail, which should be helpful for future development. Finally, the quality of the synthesized speech is assessed in terms of acceptability and intelligibility.
Includes bibliographical references (pages 6-7).

Collaborative lexicon development for Bangla
http://hdl.handle.net/10361/674
Issued: 2006-01-01; record updated: 2019-09-29T05:27:45Z
Pavel, Dewan Shahriar Hossain; Sarkar, Asif Iqbal; Shah, Faisal Muhammad; Khan, Mumit
This paper addresses the issue of building a Bangla lexicon through a collaborative effort, using a stand-alone application and a web-based interface. The words in the lexicon will be annotated with a combination of tags covering parts of speech and syntactic, semantic and other grammatical features. Bangla words have been classified into several different parts-of-speech categories, including various major word groups and subgroups. This paper aims to provide an integrated, user-friendly software interface for annotating a large existing Bangla word set, and proposes a mechanism for bringing linguists and other interested people together in the lexicon-building process. The effort will be significant progress towards the development of a properly annotated lexicon. Its outcome will significantly help the processes of morphological analysis, automatic grammar extraction and machine translation for Bangla.
Includes bibliographical references (page 7).

A proposed automated extraction procedure of Bangla text for corpus creation in unicode
http://hdl.handle.net/10361/672
Issued: 2006-01-01; record updated: 2019-09-29T05:27:43Z
Pavel, Dewan Shahriar Hossain; Sarkar, Asif Iqbal; Khan, Mumit
This paper addresses the issue of automated Bangla corpus creation, which will significantly help the processes of lexicon development, morphological analysis, automatic parts-of-speech detection, automatic grammar extraction and machine translation. The plan is to collect all free Bangla documents on the World Wide Web, together with the available offline documents, and extract all the words in them to build a large repository of text. This body of text, or corpus, will be used for several purposes in Bangla language processing after it is converted to Unicode text. The conversion process is itself an associated and equally important research and development issue. Among several possible procedures, our research focuses on a combination of font detection, language detection and Unicode conversion of the retrieved Bangla text as a solution for automatic Bangla corpus creation; the methodology is described in the paper.
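As an illustration only, the language-detection step mentioned in this abstract could be approximated by checking how much of a retrieved text already lies in the Unicode Bengali block (U+0980 to U+09FF). This is a minimal sketch under that assumption, not the procedure from the paper; the function names and the 0.5 threshold are hypothetical.

```python
# Hypothetical sketch: not the paper's method. Only the Bengali block
# range (U+0980..U+09FF) is taken from the Unicode Standard.
BENGALI_START, BENGALI_END = 0x0980, 0x09FF


def bangla_ratio(text: str) -> float:
    """Fraction of non-whitespace characters inside the Bengali block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    in_block = sum(1 for c in chars if BENGALI_START <= ord(c) <= BENGALI_END)
    return in_block / len(chars)


def is_unicode_bangla(text: str, threshold: float = 0.5) -> bool:
    """Assumed heuristic: treat text as Unicode Bangla when at least
    `threshold` of its non-whitespace characters are Bengali code points."""
    return bangla_ratio(text) >= threshold
```

Text that fails this check may be in another language or may be Bangla stored in a legacy font encoding; the sketch cannot tell the two apart, which is where a separate font-detection and conversion step would be needed.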
Includes bibliographical references (page 5).