Malware_Classification_Final_Project

This project is inspired by the winners of the Microsoft Malware Classification Challenge. Its objective is to classify executable files as either benign or as one of nine malware classes.

To achieve this goal, we used two models:

  • Machine Learning - Our main feature is based on opcode counts: we read the disassembly of EXE files and split the opcode sequence into n-grams. We used the XGBoost package (an implementation of gradient-boosted decision trees) to build multiple decision trees and combine them into an improved model.

  • Deep Learning - We implemented a convolutional neural network based on the paper 'Malware Detection by Eating a Whole EXE' by Raff et al.
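The opcode n-gram counting described for the machine learning model can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the opcode sequence, and the choice of n = 2 are all assumptions. It assumes the disassembly has already been reduced to a list of opcode mnemonics.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Count every run of n consecutive opcodes (hypothetical helper)."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def to_feature_vector(counts, vocabulary):
    """Lay the counts out in a fixed vocabulary order; unseen n-grams get 0."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical opcode sequence standing in for one disassembled EXE file.
opcodes = ["push", "mov", "call", "push", "mov", "call", "ret"]

bigrams = opcode_ngrams(opcodes, 2)
vocabulary = sorted(bigrams)  # in practice the vocabulary would be built from the training set
features = to_feature_vector(bigrams, vocabulary)
```

One fixed-length vector per file, stacked into a matrix, is the kind of input a gradient-boosted tree classifier such as `xgboost.XGBClassifier` accepts.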

Results:

We examined files that fall into ten classes (one benign class and nine malware classes). We also ensured that each class is equally represented in the test set, so that we can verify the model does not simply classify all files into the same class.
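One way to build a test set with equal class representation is to hold out the same number of files from every class. The sketch below is an assumption about how this could be done, not the project's actual procedure; `balanced_split`, the label encoding, and the class sizes are hypothetical.

```python
import random
from collections import defaultdict


def balanced_split(labels, test_per_class, seed=0):
    """Return (train_idx, test_idx) with exactly test_per_class test samples per class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < test_per_class:
            raise ValueError(f"class {label} has only {len(indices)} samples")
        rng.shuffle(indices)
        test_idx.extend(indices[:test_per_class])
        train_idx.extend(indices[test_per_class:])
    return train_idx, test_idx


# Hypothetical labels: 0 = benign, 1..9 = malware classes, with uneven class sizes.
labels = [c for c in range(10) for _ in range(5 + c)]
train_idx, test_idx = balanced_split(labels, test_per_class=3)
```

With a balanced test set, a model that predicts a single class for every file scores only about 10% accuracy across ten classes, which makes that failure mode easy to spot.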

Machine learning:

             Accuracy       Average loss
Train set    99.487231%     0.013942
Test set     94.611516%     0.249856

[Figure: ml]

Deep Learning:

             Accuracy       Average loss
Train set    99.256321%     0.025617
Test set     91.666667%     0.368867

[Figure: dl graph]
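For reference, the two reported metrics could be computed along these lines. This is a sketch under an assumption: the source does not say which loss function "Average loss" refers to, so mean cross-entropy over the predicted class probabilities is used here as a common choice for multi-class classifiers. The function names and sample predictions are hypothetical.

```python
import math


def accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct += predicted == label
    return correct / len(labels)


def average_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class (assumed loss)."""
    total = 0.0
    for probs, label in zip(probabilities, labels):
        total -= math.log(max(probs[label], eps))  # eps guards against log(0)
    return total / len(labels)


# Hypothetical predictions for three samples over three classes.
probabilities = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
labels = [0, 1, 2]

acc = accuracy(probabilities, labels)
loss = average_cross_entropy(probabilities, labels)
```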

Requirements:

Machine Learning:

  • xgboost
  • numpy
  • scikit-learn (imported as sklearn)
  • pydasm

Deep Learning:

  • pytorch
  • numpy