Detecting online recruitment fraud by using machine learning
Abstract
Online recruitment fraud (ORF) is becoming an important issue in the cyber-crime
domain. Companies find it easier to hire people through the internet than
through traditional channels. However, this shift has attracted scammers who deceive
applicants and exploit their personal information. There have been many incidents in which innocent people have fallen for this fraud and lost money, with losses running into the millions. It also
harms businesses and the economy. Unlike other cyber-security problems, such as
email spam, phishing, and opinion fraud, the detection of ORF has
not received much attention, so the problem deserves greater emphasis. In
this paper, we propose an approach to detecting ORF. We present
results based on a previous model and describe the methodology
used to build an ORF detection model on our own
dataset. A publicly accessible dataset, fake job postings.csv
(license: CC0, Public Domain), serves as a reference for the dataset that we created.
Furthermore, we collected 4000 job postings from different job sites in Bangladesh,
301 of which are fraudulent. We evaluated several common and recent
classification models to determine which works best on our dataset: Logistic
Regression, AdaBoost, Decision Tree Classifier, Random Forest, Voting Classifier,
LightGBM, and Gradient Boosting. The
accuracies we observed for these models are:
Logistic Regression (94.67%), AdaBoost (95%), Decision Tree Classifier (95%), Random Forest (95%), Voting Classifier (95.34%), LightGBM (95.17%), and Gradient Boosting (95.17%). Through this work, we aim to provide an accurate way of detecting
fraudulent job postings.
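
To put the reported accuracies in context, the following minimal sketch computes the class balance of the collected dataset and the accuracy of a trivial baseline that labels every posting as legitimate. The counts (4000 postings, 301 fraudulent) and the per-model accuracies are taken from the abstract. The function name and the baseline comparison are illustrative assumptions and not part of the method described here, and the abstract does not state whether accuracy was measured on the full dataset or on a held-out split.

```python
# Counts and accuracies below are copied from the abstract.
# The baseline comparison itself is an illustrative assumption.
TOTAL_POSTINGS = 4000
FRAUDULENT_POSTINGS = 301

REPORTED_ACCURACY = {
    "Logistic Regression": 94.67,
    "AdaBoost": 95.0,
    "Decision Tree Classifier": 95.0,
    "Random Forest": 95.0,
    "Voting Classifier": 95.34,
    "LightGBM": 95.17,
    "Gradient Boosting": 95.17,
}


def majority_baseline_accuracy(total: int, positives: int) -> float:
    """Accuracy (in percent) of always predicting the more frequent class."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("counts must satisfy 0 <= positives <= total and total > 0")
    majority = max(positives, total - positives)
    return 100.0 * majority / total


# Share of fraudulent postings and the accuracy of the trivial baseline.
fraud_rate = 100.0 * FRAUDULENT_POSTINGS / TOTAL_POSTINGS
baseline = majority_baseline_accuracy(TOTAL_POSTINGS, FRAUDULENT_POSTINGS)

# Margin of each reported accuracy over the trivial baseline, in percentage points.
margin_over_baseline = {
    name: accuracy - baseline for name, accuracy in REPORTED_ACCURACY.items()
}

if __name__ == "__main__":
    print(f"Fraudulent share: {fraud_rate:.3f}%")
    print(f"Majority-class baseline accuracy: {baseline:.3f}%")
    for name, margin in margin_over_baseline.items():
        print(f"{name}: +{margin:.3f} points over baseline")
```

With the counts above, 7.525% of the postings are fraudulent, so the trivial baseline already reaches 92.475% accuracy. On a dataset this imbalanced, per-class measures such as precision and recall on the fraudulent class are typically read alongside accuracy.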