Email classification and meeting scheduling using classifier algorithm

Khan, Behroz Newaz; Saroar, Sk Golam; Alam, Md. Mosfaiul; Gomes, Sebastian Romy

Date

2017-04

Publisher

BRAC University

Abstract

This research compares two approaches to classifying emails by category. Naive Bayes and the Hidden Markov Model (HMM), two machine learning algorithms, are both used to detect whether an email is important or spam. The Naive Bayes classifier is based on conditional probabilities. It is fast and works well with small datasets. It treats each word as an independent feature. An HMM is a generative, probabilistic model that provides a distribution over sequences of observations. HMMs can handle inputs of variable length and help programs reach the most likely decision based on both previous decisions and current data. Various combinations of NLP techniques (stopword removal, stemming, and lemmatization) were tried with both algorithms to inspect the differences in accuracy and to find the best method among them. Along with classifying emails, this paper also describes the methodologies used for automatic meeting scheduling by an intelligent email assistant. Users who regularly send or receive messages to set up meetings will benefit greatly from this system, as it will classify their emails and schedule their meetings automatically.
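
To illustrate the Naive Bayes idea mentioned in the abstract (conditional probabilities with each word treated as an independent feature), here is a minimal sketch in Python. It is not the thesis's implementation: the function names `train` and `classify`, the toy emails in `TOY_EMAILS`, the whitespace tokenizer, and the add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data, invented for this sketch (not from the thesis).
TOY_EMAILS = [
    ("meeting tomorrow at noon", "important"),
    ("project meeting agenda attached", "important"),
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
]


def train(docs):
    """Count label frequencies and per-label word frequencies."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():  # naive whitespace tokenizer
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for `text`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_EMAILS)
print(classify(model, "free prize"))
print(classify(model, "meeting agenda"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together. Preprocessing steps such as stopword removal or stemming would be applied to the text before tokenization in both `train` and `classify`.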

Keywords

Email classification; Meeting scheduling; Classifier algorithm

Description

This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2017.

Cataloged from PDF version of thesis report.

Includes bibliographical references (pages 71-74).

Department

Department of Computer Science and Engineering, BRAC University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1400]