Analysis of malware prediction based on infection rate using machine learning techniques
Abstract
In the modern technological age, the internet has been adopted by the masses, and
with it the danger of malicious attacks by cybercriminals has increased. These
attacks are carried out via malware and have resulted in billions of dollars of financial
damage, which is why the prevention of malware attacks has become an essential part
of the battle against cybercrime. In recent years, Machine Learning has become an
important tool in the field of Malware Detection, which is the first step towards
removing malware from infected devices. In this thesis, we apply machine
learning algorithms to predict the malware infection rates of computers based on their
features. We use supervised machine learning algorithms, namely LightGBM (a gradient
boosting framework), neural networks, and decision tree learning. We
obtained a publicly available dataset and divided it into two parts, a training
set and a testing set. Across four different experiments using the
aforementioned algorithms, LightGBM was the best-performing model,
with an AUC score of 0.73926.
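
For readers unfamiliar with the metric, the AUC score can be read as the probability that a randomly chosen infected machine receives a higher predicted score than a randomly chosen clean one. The following is a minimal sketch of that definition in plain Python. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the thesis dataset or its experiments.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = infected, 0 = not infected.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.5f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a scale for interpreting the value reported above. The pairwise loop is quadratic and is meant only to show the definition; on a large dataset a rank-based implementation from a library would be the practical choice.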