Predicting diabetes using machine learning: a comparative study of supervised classification models

Pushpo, Mahzebin

dc.contributor.advisor	Islam, Mohammad Rafiqul
dc.contributor.author	Pushpo, Mahzebin
dc.date.accessioned	2024-08-20T05:12:05Z
dc.date.available	2024-08-20T05:12:05Z
dc.date.copyright	2023
dc.date.issued	2023
dc.identifier.other	ID 19216002
dc.identifier.uri	http://hdl.handle.net/10361/23819
dc.description	This thesis is submitted in partial fulfilment of the requirements for the degree of Bachelor of Science in Mathematics 2023.	en_US
dc.description	Catalogued from the PDF version of the thesis.
dc.description	Includes bibliographical references (pages 77-82).
dc.description.abstract	Diabetes is a major worldwide health concern that can develop at any age and has serious consequences. It results from imbalanced glucose levels in the body. Besides being a long-term disease, it carries associated risks ranging from life-threatening complications to financial loss, so it is essential to detect the condition correctly and as early as possible to mitigate further complications. Thanks to developments in medical technology, many tools are now available for diagnosing diseases. One such tool, machine learning (ML), is used to speed up the prediction and diagnosis of patients. ML is a branch of Artificial Intelligence (AI) that trains a system by imitating the human learning process. In this study, the algorithms used to predict diabetes are supervised classification ML algorithms: Logistic Regression, K-Nearest Neighbor, Naïve Bayes, Decision Tree, and Random Forest. The study uses primary data collected from Bangladeshi adults of different age groups, comprising the demographic data, medical history, and family information necessary for the study. The collected dataset is cleaned of duplicates and errors. Diabetes status is taken as the dependent variable, and the associated risk factors are the independent variables. The models are then implemented using the RapidMiner tool. Confusion matrices are produced for each model, and a comparative analysis is carried out. After evaluating the models' performances, the two highest accuracies achieved were 94.62% and 94.23%. From these findings, the best model can be determined. Selecting the best model is useful because it will support the proper and timely identification of patients in the healthcare sector, so that treatment can be given to curb the disease.	en_US
dc.description.statementofresponsibility	Mahzebin Pushpo
dc.format.extent	84 pages
dc.language.iso	en	en_US
dc.publisher	Brac University	en_US
dc.rights	Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.
dc.subject	Machine learning	en_US
dc.subject	Logistic regression	en_US
dc.subject.lcsh	Diabetes -- Data processing.
dc.subject.lcsh	Diabetes -- Prediction.
dc.title	Predicting diabetes using machine learning: a comparative study of supervised classification models	en_US
dc.type	Thesis	en_US
dc.contributor.department	Department of Mathematics and Natural Sciences, BRAC University
dc.description.degree	B. Mathematics

Files in this item

Name: 19216002_MNS.pdf
Size: 1.593Mb
Format: PDF

This item appears in the following Collection(s)

Thesis (Bachelor of Science in Mathematics) [13]