dc.contributor.advisor | Rasel, Annajiat Alim | |
dc.contributor.author | Karimi, Sadullah | |
dc.date.accessioned | 2022-10-24T10:37:55Z | |
dc.date.available | 2022-10-24T10:37:55Z | |
dc.date.copyright | 2022 | |
dc.date.issued | 2022-09 | |
dc.identifier.other | ID: 21166041 | |
dc.identifier.uri | http://hdl.handle.net/10361/17532 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering, 2022. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 167-194). | |
dc.description.abstract | Technology adoption is extremely limited in Afghanistan, largely because people have limited access to the
Internet, smartphones, and computers due to power limitations and the high cost of Internet service. Internet
access in Afghanistan is provided by the private sector at high cost and with very low speed and
quality. Natural Language Processing (NLP) has various applications and improves access to information and
systems. To advance as a country, Afghanistan needs to be able to use existing databases and datasets, and to
create and maintain new ones. Initially, people need a system that lets them access databases providing
various kinds of guidance using the limited resources available to them. Later, they would benefit from higher-level
access for maintenance and crowdsourced contributions. This work first focuses on building a system through which
people in Afghanistan can access databases in their native language. The Afghan (Dari) language is one of the widely
used languages, with up to 110 million speakers worldwide. It is used in countries such as Afghanistan, Azerbaijan,
Iran, Iraq, Russia, Tajikistan, Turkmenistan, and Uzbekistan. The Afghan language lacks resources and requires
more qualified lexicon translation. The proposed Afghan Natural Language Interface to Database (NLIDB) is based on
a natural language query-response model. The Afghan language is used in the model to extract the desired
data from a database. Retrieving data from a database requires knowledge of SQL or a
very well-designed user interface. It is easy for domain experts to retrieve data from databases. However, it
is quite challenging for non-expert users to access a database using SQL queries in the absence of a proper,
user-friendly interface. This work overcomes that challenge, enabling speakers of the Afghan language worldwide
to access different databases and datasets. First, we surveyed the current state of Afghan NLP to find
research gaps for future researchers of the Afghan language. We identified a research gap in NLIDB
systems. Second, we surveyed non-English NLIDB systems and conducted a systematic review of the current
methods of non-English NLIDB. Third, we propose an NLIDB system for the Afghan language. Through our system,
users in Afghanistan can access databases through feature phones and landline calls, based on an open-source
Interactive Voice Response (IVR) system, in addition to smartphones and computers. Users can easily
access databases through the system without high-speed Internet, reliable power, a computer, or a smartphone.
The system is designed for the limited technology available in Afghanistan. The Afghan
Spoken NLIDB is built on lexical analysis, semantic analysis, and syntax analysis, which it uses to transform an
Afghan-language natural language query into Structured Query Language (SQL). | en_US |
dc.description.statementofresponsibility | Sadullah Karimi | |
dc.format.extent | 194 Pages | |
dc.language.iso | en_US | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Natural Language Querying | en_US |
dc.subject | Translating From Afghan to English | en_US |
dc.subject | Lexical analysis | en_US |
dc.subject | Syntax analysis | en_US |
dc.subject | Semantic analysis | en_US |
dc.subject | Query Generation | en_US |
dc.subject | Python Library | en_US |
dc.subject | Data dictionary | en_US |
dc.subject | Natural language interface to database | en_US |
dc.subject | NLIDB | en_US |
dc.subject | Non-English NLIDB | en_US |
dc.subject | Natural language interface | en_US |
dc.subject | NLI | en_US |
dc.subject | Natural language user interface | en_US |
dc.subject | NLUI | en_US |
dc.subject | Afghan NLP survey | en_US |
dc.subject | Dari | en_US |
dc.subject.lcsh | Dari language. | |
dc.title | Survey of Afghan (Dari) Language NLP for Building Afghan NLIDB System | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | M. Computer Science and Engineering | |