Comparative analysis of machine learning models for the prediction of asthma disease among the cardiovascular disease patients
Abstract
Cardiovascular diseases (CVD) are a leading cause of morbidity and mortality worldwide, and recent studies have highlighted a potential association between CVD and the development of asthma. Predicting the likelihood of asthma in patients with CVD is therefore important for early intervention and effective management. Machine learning (ML), a subset of Artificial Intelligence (AI) in which systems learn predictive tasks from data, offers powerful tools for disease prediction. This study applies five supervised classification algorithms, Logistic Regression, K-Nearest Neighbour (KNN), Naïve Bayes, Decision Tree, and Random Forest, to predict the likelihood of asthma in individuals with CVD. The dataset comprises primary data collected from adults, including demographic information, medical history, and relevant health indicators, and was cleaned before modelling. Using RapidMiner, we built a predictive model with each algorithm and generated its confusion matrix to evaluate performance. The Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest models each achieved an accuracy of 84.78%, while KNN reached 79.86%. Despite this high accuracy, all models showed low recall, meaning that they missed many true positive cases of asthma. Naïve Bayes had the highest precision, followed by Logistic Regression, Random Forest, and Decision Tree, with KNN lowest. Negative predictive value (PVN) was consistent across most models, indicating that their negative predictions were reliable. The comparison shows that predictive models must be evaluated on multiple performance metrics, not on accuracy alone.
Our findings suggest that Random Forest, Naïve Bayes, and Logistic Regression are the most promising algorithms for predicting asthma likelihood in cardiovascular disease patients. However, further refinements and hyperparameter tuning are necessary to enhance recall rates and overall predictive performance. This study lays the groundwork for using machine learning in predicting asthma risks among cardiovascular disease patients, aiming to improve early detection and intervention strategies in clinical practice.
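As an illustration of why accuracy alone can mislead, the following minimal Python sketch derives accuracy, precision, recall, and negative predictive value (PVN) from a binary confusion matrix. The counts are hypothetical and chosen only to show the pattern of high accuracy with low recall; they are not the study's data, and the `ConfusionMatrix` class and `metrics` function are names invented for this example, not part of RapidMiner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; 'positive' means asthma is present."""
    tp: int  # asthma cases predicted as asthma
    fp: int  # non-asthma cases predicted as asthma
    fn: int  # asthma cases predicted as non-asthma
    tn: int  # non-asthma cases predicted as non-asthma


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(cm: ConfusionMatrix) -> dict:
    """Compute the four metrics discussed in the abstract."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    return {
        "accuracy": _ratio(cm.tp + cm.tn, total),
        "precision": _ratio(cm.tp, cm.tp + cm.fp),
        "recall": _ratio(cm.tp, cm.tp + cm.fn),
        "npv": _ratio(cm.tn, cm.tn + cm.fn),  # negative predictive value
    }


# Hypothetical counts (an assumption for illustration, NOT the study's results):
# an imbalanced sample with 100 asthma cases among 1000 patients.
example = ConfusionMatrix(tp=5, fp=5, fn=95, tn=895)
scores = metrics(example)

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts the classifier is right on 90% of patients yet finds only 5% of the asthma cases, which is the same qualitative pattern the abstract reports: when the positive class is rare, correct negative predictions dominate both accuracy and negative predictive value.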