Early detection of chronic kidney disease using machine learning
Abstract
Chronic kidney disease (CKD) is a globally prevalent ailment that claims a large number of
lives. CKD is the 11th leading cause of global mortality, with 1.2 million deaths each year,
and according to the Kidney Foundation of Bangladesh, around 40,000 people with CKD experience
kidney failure annually, while several thousand die at an early age because of
CKD. Predictive analytics for healthcare using machine learning is a challenging task that can help
doctors decide on appropriate treatments to save lives. Researchers have studied chronic
kidney disease collaboratively, but most of their work relies on purely statistical models, leaving numerous
gaps in the development of machine-learning models. In this article, we discussed the current
methods and proposed an improved approach based on XGBoost (Extreme Gradient Boosting),
which combined the most significant features according to their F scores and was evaluated under four preprocessing
scenarios. In addition, we provided machine learning methods for predicting chronic kidney
disease from clinical information. The machine learning techniques explored include
Support Vector Regressor (SVR), Logistic Regressor (LR), AdaBoost, Gradient Boosting Tree,
and Decision Tree Regressor. The features are drawn from the UCI chronic kidney
disease dataset, and the results of these models are compared to determine the best regression model for
the prediction. Of the four preprocessing cases, replacing missing values with the mean value of
each column and selecting important features was the most reasonable, as it allows training on more
data without dropping records. XGBoost gave the best results in all four cases: it
obtained 98% accuracy in case one, where null values were dropped; 98.75% testing accuracy
in both case two and case three, where null values were replaced with the minimum and maximum values
of each column, respectively; and 100% accuracy in case four, where null values were replaced with
mean values. Thus, the system can be implemented
for early-stage CKD prediction in a cost-efficient way, which will be helpful for underdeveloped
and developing countries.
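
The four preprocessing scenarios described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the scenario labels, and the toy records (columns "age" and "bp" with invented values) are all assumptions made for illustration, and the mapping of case two to the minimum and case three to the maximum is inferred from the order in which the abstract lists them.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]


def drop_missing(rows: List[Row]) -> List[Row]:
    """Case one: discard every record that has at least one missing value."""
    return [dict(r) for r in rows if all(v is not None for v in r.values())]


def fill_missing(rows: List[Row], stat: Callable) -> List[Row]:
    """Cases two to four: replace missing values with a per-column statistic.

    Assumes `rows` is non-empty and every column has at least one known value.
    """
    columns = list(rows[0].keys())
    fill = {c: stat([r[c] for r in rows if r[c] is not None]) for c in columns}
    return [{c: (fill[c] if r[c] is None else r[c]) for c in columns} for r in rows]


# Hypothetical labels for the four scenarios; the case numbering is an assumption.
SCENARIOS = {
    "case1_drop": drop_missing,
    "case2_min": lambda rows: fill_missing(rows, min),
    "case3_max": lambda rows: fill_missing(rows, max),
    "case4_mean": lambda rows: fill_missing(rows, mean),
}

# Toy records with invented values, not taken from the UCI dataset.
toy = [
    {"age": 48.0, "bp": 80.0},
    {"age": None, "bp": 70.0},
    {"age": 62.0, "bp": None},
    {"age": 50.0, "bp": 90.0},
]

prepared = {name: fn(toy) for name, fn in SCENARIOS.items()}
```

On the toy data, dropping keeps only two of the four records, while the three imputation scenarios keep all four, which is the trade-off the abstract points to when it favors mean imputation.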