A Machine Learning Approach to Credit Default Prediction and Individual Credit Scoring
Abstract
A credit scoring system is not yet in practice in our country, so for our undergraduate thesis we took on the challenge of building a model that uses machine learning techniques to predict loan defaults. Our main goal is to forecast credit defaults, and to that end we developed a model that outputs a target score, known as the “credit score”, which describes the trustworthiness of an individual applying for a loan. We trained and tested this model on the ‘German credit data’, which we later modified. We identified 37 features in the data and narrowed them to 23 through feature selection. After thorough observation, we analyzed the dataset with several models, including Logistic Regression, FLDA, Naïve Bayes, Decision Tree, Gradient Boosting Tree, and Random Forest. We then built a scoring format that uses weights derived from the features' information gains and their correlations to assign a credit score to each individual. Finally, we used a decision tree to predict, on the basis of the generated scores, who should receive a loan.
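As an illustration of the scoring idea, the following is a minimal sketch of how weights derived from information gain could be turned into a score. It is not the thesis implementation. The toy records, the feature names (`checking`, `housing`), the helper names, and the 0 to 100 scoring rule (a gain-weighted average of the share of good outcomes seen for each feature value) are all assumptions made for the example. The correlation-based adjustment and the final decision tree step are left out.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the label entropy left after splitting on a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_weights(rows, labels, features):
    """Normalize per-feature information gains so that the weights sum to 1."""
    gains = {f: information_gain([r[f] for r in rows], labels) for f in features}
    total = sum(gains.values())
    # Fall back to equal weights if no feature carries any information.
    return {f: (g / total if total else 1 / len(features)) for f, g in gains.items()}


def good_rates(rows, labels, features, good="good"):
    """Share of 'good' outcomes observed for each value of each feature."""
    counts = {f: defaultdict(lambda: [0, 0]) for f in features}
    for row, label in zip(rows, labels):
        for f in features:
            counts[f][row[f]][0] += label == good  # number of good outcomes
            counts[f][row[f]][1] += 1              # number of records
    return {f: {v: g / n for v, (g, n) in c.items()} for f, c in counts.items()}


def credit_score(applicant, weights, rates):
    """Hypothetical 0-100 score: gain-weighted average of per-value good rates.

    Feature values never seen in the training rows get a neutral rate of 0.5.
    """
    return 100 * sum(w * rates[f].get(applicant[f], 0.5) for f, w in weights.items())


# Toy data, invented for illustration only (not taken from the German credit data).
FEATURES = ["checking", "housing"]
ROWS = [
    {"checking": "positive", "housing": "own"},
    {"checking": "positive", "housing": "rent"},
    {"checking": "negative", "housing": "own"},
    {"checking": "negative", "housing": "rent"},
]
LABELS = ["good", "good", "bad", "bad"]

WEIGHTS = gain_weights(ROWS, LABELS, FEATURES)
RATES = good_rates(ROWS, LABELS, FEATURES)
SCORE = credit_score({"checking": "positive", "housing": "own"}, WEIGHTS, RATES)
```

In this toy set `checking` separates the two classes perfectly while `housing` carries no information, so all of the weight falls on `checking`. On real data the weights would be spread across the selected features, and the resulting scores could then be passed to a classifier for the final loan decision.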