Performance analysis of machine learning algorithms for Malware classification
dc.contributor.advisor | Chakrabarty, Amitabha | |
dc.contributor.advisor | Rodoshi, Ahanaf Hassan | |
dc.contributor.author | Bushra, Raisa Hasan | |
dc.contributor.author | Alam, Md Taukir | |
dc.contributor.author | Saha, Aniruddho | |
dc.contributor.author | Fahim, Nazmus Sakib | |
dc.contributor.author | Binty, Nabila Mourium | |
dc.date.accessioned | 2023-10-15T10:39:29Z | |
dc.date.available | 2023-10-15T10:39:29Z | |
dc.date.copyright | ©2022 | |
dc.date.issued | 2022-09-29 | |
dc.identifier.other | ID 18301064 | |
dc.identifier.other | ID 18301277 | |
dc.identifier.other | ID 18201117 | |
dc.identifier.other | ID 18201166 | |
dc.identifier.other | ID 19101082 | |
dc.identifier.uri | http://hdl.handle.net/10361/21825 | |
dc.description | This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. | en_US |
dc.description | Cataloged from PDF version of thesis. | |
dc.description | Includes bibliographical references (pages 32-36). | |
dc.description.abstract | Malware detection research has been popular over the years as the variety and complexity of malware attacks increase daily. Using various supervised and unsupervised machine learning algorithms to detect, identify, or classify malware attacks has proven to be a very effective technique in recent years. Some common and widely concerning malware attacks are Trojan, Adware, Ransomware, and Zero-day. In this paper, we used ten ML algorithms, namely AdaBoost, Stochastic Gradient Descent (SGD), Naïve Bayes (NB), Decision Tree (DT), Random Forest (RF), XGBoost, Logistic Regression (LR), Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), to classify software-based Trojan, Ransomware, Adware, and Zero-day attacks. This research was conducted on a dataset with a total of 12863 malware samples, covering the malware categories mentioned above, from which features were extracted and patterns learned. After testing each classifier on the selected dataset, we also compared these ML methods and analyzed how they classify these popular malware types. After implementation, RF achieved the highest accuracy of 86.97%, and Gaussian NB achieved the lowest accuracy of 47.84%. MLP, XGBoost, KNN, DT, AdaBoost, SVM, LR, and SGD achieved 83.60%, 82.59%, 80.68%, 79.63%, 73.30%, 73.22%, 67.08%, and 64.40% accuracy, respectively. Beyond overall accuracy, our analysis was based on the individual accuracy, precision, F1-score, TPR, TNR, FPR, and FNR of the malware classes for each ML classifier. | en_US |
dc.description.statementofresponsibility | Raisa Hasan Bushra | |
dc.description.statementofresponsibility | Md Taukir Alam | |
dc.description.statementofresponsibility | Aniruddho Saha | |
dc.description.statementofresponsibility | Nazmus Sakib Fahim | |
dc.description.statementofresponsibility | Nabila Mourium Binty | |
dc.format.extent | 47 pages | |
dc.language.iso | en | en_US |
dc.publisher | Brac University | en_US |
dc.rights | Brac University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. | |
dc.subject | Machine learning | en_US |
dc.subject | Trojan | en_US |
dc.subject | Adware | en_US |
dc.subject | Ransomware | en_US |
dc.subject | Classification | en_US |
dc.subject | Malware | en_US |
dc.subject | Zero-day | en_US |
dc.subject | Naïve Bayes | en_US |
dc.subject | Stochastic gradient descent | en_US |
dc.subject | Random forest | en_US |
dc.subject | Decision tree | en_US |
dc.subject | AdaBoost | en_US |
dc.subject | XGBoost | en_US |
dc.subject | Logistic regression | en_US |
dc.subject | Multi-layer perceptron | en_US |
dc.subject | K-nearest neighbour | en_US |
dc.subject | Support vector machine | en_US |
dc.subject.lcsh | Regression analysis | |
dc.subject.lcsh | Computer algorithms | |
dc.title | Performance analysis of machine learning algorithms for Malware classification | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Computer Science and Engineering, Brac University | |
dc.description.degree | B.Sc. in Computer Science | |