Show simple item record

dc.contributor.author: Naushad UzZaman
dc.contributor.author: Khan, Mumit
dc.date.accessioned: 2010-10-04T05:22:33Z
dc.date.available: 2010-10-04T05:22:33Z
dc.date.copyright: 2005
dc.date.issued: 2005
dc.identifier.uri: http://hdl.handle.net/10361/312
dc.description: Includes bibliographical references (page 6).
dc.description.abstract: Almost any word can be a Bangali name, and the name in turn is often spelled in many different ways, all of which are considered correct and interchangeable. The reason for the spelling complication is two-fold: (1) there is a large gap between the script and pronunciation in Bangla, largely attributed to the large scale Sanskritization process that started in the 12th century and continued throughout the middle ages, and (2) typical Bangla names have very different origins, from the indigenous names derived primarily from Sanskrit, to the imported Muslim names from Persian and Arabic, Christian names from Portuguese, and even the names from popular Western TV soap-operas. However, there is always a large degree of phonetic similarity in the spelling variants of a name, which is the key to searching and matching names in records. We present a Double Metaphone encoding for Bangla names, taking into account the various spelling and phonetic rules in use, which can be used by applications to search for and match names. We encode the spelling variants of a large number of names found in the literature to demonstrate that the encoding does indeed show that the variants of a name are equivalent. A name searching algorithm may employ various figures of merit to narrow the list of possibilities when searching for similar names; we demonstrate one such figure of merit using name encoding and edit distance that has shown good promise. [en_US]
dc.format.extent: 6 pages
dc.language.iso: en [en_US]
dc.publisher: BRAC University [en_US]
dc.subject: Name searching [en_US]
dc.subject: Name encoding [en_US]
dc.subject: Phonetic encoding [en_US]
dc.subject: Double metaphone encoding [en_US]
dc.subject: Bangla language [en_US]
dc.title: A double metaphone encoding for approximate name searching and matching in Bangla [en_US]
dc.type: Article [en_US]
dc.contributor.department: Center for Research on Bangla Language Processing (CRBLP), BRAC University

