A light weight stemmer for Bengali and its use in spelling checker

Islam, Md. Zahurul; Uddin, Md. Nizam; Khan, Mumit

Date

2007

Publisher

BRAC University

Abstract

Stemming is an operation that splits a word into its constituent root and affix without performing a complete morphological analysis. It is used to improve the performance of spelling checkers and information retrieval applications, where full morphological analysis would be too computationally expensive. For spelling checkers specifically, stemming can drastically reduce the dictionary size, which is often a bottleneck on mobile and embedded devices. This paper presents a computationally inexpensive stemming algorithm for Bengali that handles suffix removal in a domain-independent way. An evaluation of the proposed algorithm in a Bengali spelling checker indicates that it can be used effectively in information retrieval applications in general.
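
As a rough illustration of suffix removal without full morphological analysis, the following Python sketch strips the longest matching suffix from a word and then looks the stem up in a root lexicon. It is not the algorithm from the paper: the suffix list, the minimum stem length, and the names light_stem and is_known_word are assumptions chosen only for this example.

```python
# Illustrative suffix list only; NOT the rule set used in the paper.
# These are a few common Bengali nominal suffixes (plural, case, article).
SUFFIXES = ("দের", "রা", "কে", "টি")


def light_stem(word, suffixes=SUFFIXES, min_stem_len=2):
    """Split a word into (stem, suffix) by longest-match suffix stripping.

    Returns (word, "") when no suffix matches or when stripping would
    leave a stem shorter than min_stem_len code points.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if len(stem) >= min_stem_len:
                return stem, suffix
    return word, ""


def is_known_word(word, root_lexicon):
    """Hypothetical spell-check lookup: accept a word if it, or its stem,
    appears in a lexicon that stores roots only."""
    stem, _ = light_stem(word)
    return word in root_lexicon or stem in root_lexicon


if __name__ == "__main__":
    roots = {"ছেলে", "মানুষ", "বই"}  # toy root lexicon
    for w in ("ছেলেরা", "ছেলেদের", "মানুষকে", "বইটি"):
        print(w, light_stem(w), is_known_word(w, roots))
```

In this toy setup the lexicon holds three roots but accepts several inflected forms of each, which is the kind of dictionary-size saving the abstract describes. A real stemmer would need a much fuller suffix inventory and rules for stacked suffixes.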

Keywords

Bengali spelling checker

Description

Includes bibliographical references (page 6).

Department

Center for Research on Bangla Language Processing (CRBLP), BRAC University

Type

Article

Collections

Conference Papers (Centre for Research on Bangla Language Processing)