Loan approval prediction using machine learning algorithms
Abstract
This research evaluates several classical machine learning classifiers and deep
neural network architectures for predicting the status of a loan
application. A dataset of 613 observations and 13 features, describing the
applicants and their credit profiles, was expanded using
techniques such as bootstrapping to improve data quality, ultimately yielding
9824 observations. Imputation strategies were applied to handle missing
values, and features were prepared using ANOVA, Mutual
Information, and tree-based approaches, among other statistical methods. For the
validation of the model performance, the dataset was split into two parts: training
(70%) and testing (30%). Many classical machine learning algorithms were applied
including Logistic Regression, Support Vector Classifiers (SVC),
Decision Trees, Random Forests, Multi-Layer Perceptron, Gradient Boosting Machines,
and K-Nearest Neighbors. Of all the models evaluated, the Random
Forest Classifier achieved the highest accuracy (86.84%) and F1-score
(0.9043), making it the best performer. Resampling techniques such
as SMOTE (accuracy of 88.16%) and ADASYN (accuracy of 87.07%) were also
used to address class imbalance; after resampling, K-Nearest
Neighbors reached an accuracy of 88.16%. In a separate but
related analysis, five neural network architectures, Simple Recurrent Neural
Network (RNN), Long Short-Term Memory (LSTM), Convolutional Neural Network
(CNN), Fully Convolutional Neural Network (FCNN), and Fully Connected
Neural Network (FCN), were built using TensorFlow, scikit-learn, and
NumPy in Google Colaboratory notebooks. The results showed that
the Fully Convolutional Network (FCN) achieved the best validation accuracy (89.75%)
and validation loss (0.2255) among the models built.
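
The data-preparation and scoring steps named above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the helper names (`bootstrap`, `split_train_test`, `accuracy_and_f1`), the fixed random seed, and the placeholder rows are all assumptions made for the example. Only the sizes (613 and 9824 observations) and the 70/30 split come from the abstract.

```python
import random


def bootstrap(rows, n_samples, seed=0):
    """Draw n_samples rows with replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(n_samples)]


def split_train_test(rows, test_fraction=0.30, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1-score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Placeholder data (assumption): 613 rows of (applicant_id, label),
# standing in for the real loan records, which are not reproduced here.
rows = [(i, 1 if i % 3 else 0) for i in range(613)]

resampled = bootstrap(rows, 9824)          # expand 613 rows to 9824
train, test = split_train_test(resampled)  # 70% training, 30% testing
```

With these sizes the split yields 6877 training rows and 2947 test rows. A classifier's predictions on the test rows could then be passed to `accuracy_and_f1` to obtain the two metrics reported in the abstract.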