Comparison of machine learning techniques to predict cardiovascular disease
The purpose of this thesis is to examine and compare the accuracy of different data mining classification systems, built with different machine learning techniques, for predicting cardiovascular disease. The comparison reports the accuracy rate of each technique and the reasons behind the variations. The study uses the Cleveland heart disease dataset, which contains 303 instances. The data was divided into training and testing datasets, and 10-fold cross-validation was used in order to work with the expanded dataset. The machine learning techniques investigated are k-Nearest Neighbors, Support Vector Machine, Decision Tree, Random Forest, Gaussian Naive Bayes, Logistic Regression, and Deep Belief Network. In addition, an ensemble learning method, the voting classifier, was applied to the dataset. By the end of the implementation, we found that Gaussian Naive Bayes gave the highest accuracy on our dataset, while the Deep Belief Network performed very poorly. The study explains these variations by analyzing the characteristics and behavior of each technique with respect to the dataset.
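As a rough illustration of the evaluation procedure described above, the following sketch runs 10-fold cross-validation with a small hand-written Gaussian Naive Bayes classifier. It is not the thesis implementation: the function names (`k_fold_indices`, `fit_gnb`, `predict_gnb`, `cross_val_accuracy`) are hypothetical, and the data is synthetic (two Gaussian features, 303 rows chosen only to match the instance count of the Cleveland dataset), so the printed accuracy says nothing about the thesis results.

```python
import math
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_gnb(X, y):
    """Estimate the prior, per-feature means, and per-feature variances of each class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        # A small constant keeps the variance strictly positive.
        varis = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                 for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(X), means, varis)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(prior, means, varis):
        total = math.log(prior)
        for v, m, s2 in zip(x, means, varis):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda c: log_posterior(*model[c]))

def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the k accuracies."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_gnb([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict_gnb(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Synthetic stand-in data (assumption): NOT the Cleveland dataset.
rng = random.Random(42)
y = [i % 2 for i in range(303)]
X = [[rng.gauss(3.0 * label, 1.0), rng.gauss(-3.0 * label, 1.0)] for label in y]

accuracy = cross_val_accuracy(X, y, k=10)
print(f"mean 10-fold accuracy on synthetic data: {accuracy:.3f}")
```

Each instance is used for testing exactly once and for training in the other nine folds, which is why this scheme suits a dataset of only a few hundred rows. The same loop could wrap any of the other classifiers named in the abstract in place of the Gaussian Naive Bayes functions.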