Performance analysis of machine learning classifiers for detecting PE malware
Abstract
In the modern era of technology, securing and protecting one's data is a major
concern. Malware is a program designed to cause harm, and malware analysis is a
primary focus of cyber forensic professionals and network administrators. The
degree of harm caused by malicious software varies greatly. For a home user, an
infection may lead to the loss of some unimportant information, but for a
corporate network it can lead to the loss of valuable business data. Existing
research focuses on only a few machine
learning algorithms to detect malware, and very few studies work with Portable
Executable (PE) files. We work with PE files and, for real-time computation,
developed a client-server model using Flask that classifies a file as malware
or benign. In this paper, we focus on top classification algorithms and compare
their accuracy to find out which one gives the best result on the dataset. We
used top machine learning classification algorithms, such as XGBoost, Support
Vector Machine, and Extra Tree Classifier, among others, alongside an
Artificial Neural Network. The experimental results show that XGBoost achieved
the highest accuracy of 98.62 percent
when compared with the other approaches. Through this research on malware
detection, we aim to provide a better solution for this kind of anomaly and to
contribute to building strong, protective cybersecurity.