Detecting online recruitment fraud by using machine learning

Ghosh, Gitanjali; Tabassum, Hridita; Atika, Afra; Kutubuddi, Zainab

Date

2021-01

Publisher

Brac University

Abstract

Online recruitment fraud (ORF) is becoming an important issue in cybercrime. Companies find it easier to hire through the internet than through traditional channels, but online hiring has also attracted scammers who deceive applicants and exploit their information. There have been many incidents in which innocent people fell for this fraud and lost millions. It also harms businesses and the economy. Unlike other cybersecurity problems such as email spam, phishing, and opinion fraud, the detection of ORF has not received much attention, so the issue needs to be highlighted more. In this paper, we propose an approach to detecting ORF. We present results based on a previous model, together with the methodology we use to build an ORF detection model on our own dataset. As a reference for the dataset we created, we use the publicly accessible fake job postings.csv dataset (license CC0: Public Domain). We collected 4000 job postings from different job sites in Bangladesh, of which 301 are fraudulent. We applied several common and recent classification models to find which algorithm works best for our data: Logistic Regression, AdaBoost, Decision Tree Classifier, Random Forest, Voting Classifier, LightGBM, and Gradient Boosting. The observed accuracies of the prediction models are: Logistic Regression (94.67%), AdaBoost (95%), Decision Tree Classifier (95%), Random Forest (95%), Voting Classifier (95.34%), LightGBM (95.17%), and Gradient Boosting (95.17%). With this report, we have tried to provide a precise way of detecting fraudulent job postings.
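The accuracy figures above can be read against the class balance the abstract states (301 fraudulent postings out of 4000). The following is a minimal sketch, not the authors' evaluation code: it computes the accuracy of a hypothetical baseline that labels every posting as legitimate, which comes to about 92.5%. The `accuracy` helper, the label encoding, and the choice to compute over the whole collected set are assumptions for illustration, since the abstract does not describe the evaluation split.

```python
# Figures taken from the abstract: 4000 collected postings, 301 fraudulent.
TOTAL_POSTINGS = 4000
FRAUDULENT = 301


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels with the stated class balance:
# 1 = fraudulent, 0 = legitimate (encoding is an assumption).
y_true = [1] * FRAUDULENT + [0] * (TOTAL_POSTINGS - FRAUDULENT)

# Hypothetical baseline that predicts "legitimate" for every posting.
y_baseline = [0] * TOTAL_POSTINGS

baseline_accuracy = accuracy(y_true, y_baseline)
print(f"Majority-class baseline accuracy: {baseline_accuracy:.2%}")
```

On data this imbalanced, accuracy alone says little about how many fraudulent postings are caught, so a comparison of classifiers would usually also report recall or precision on the fraudulent class.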

Keywords

Machine Learning; Fraud Detection; Prediction; Decision Tree Classifier; Logistic Regression algorithm; Adaptive Boosting; Random Forest Classifier; Decision trees; Gradient Boost; LightGBM

LC Subject Headings

Machine learning

Description

This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2021.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 41-43).

Department

Department of Computer Science and Engineering, Brac University

Type

Thesis

Collections

Thesis & Report, BSc (Computer Science and Engineering) [1589]