Malware_Classification_Final_Project

This project is inspired by the winners of the "Microsoft Malware Classification Challenge". Our objective is to classify executable files as either benign or as one of nine malware classes.

To achieve this goal, we used two models:

Machine Learning - Our main feature is based on opcode counts: we read the disassembly of EXE files and split the opcode sequence into n-grams. We used the XGBoost package (an implementation of gradient boosted decision trees) to build multiple decision trees and combine them into an improved model.
Deep Learning - We implemented a convolutional neural network based on the paper by Raff et al., 'Malware Detection by Eating a Whole EXE'.
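
As an illustration of the opcode n-gram feature described above, here is a minimal sketch in Python. It is not the project's actual extraction code: the function names `opcode_ngrams` and `to_feature_vector`, the sample opcode list, and the choice of n = 2 are all assumptions made for this example. It assumes the opcodes have already been pulled out of the disassembly as a list of mnemonics.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Count contiguous n-grams in a sequence of opcode mnemonics."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # zip over n shifted copies of the sequence yields every window of length n
    windows = zip(*(opcodes[i:] for i in range(n)))
    return Counter(windows)


def to_feature_vector(counts, vocabulary):
    """Map n-gram counts onto a fixed, ordered vocabulary (0 for unseen n-grams)."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical opcode sequence standing in for one disassembled file
sample = ["push", "mov", "call", "push", "mov", "ret"]
counts = opcode_ngrams(sample, 2)

# In practice the vocabulary would be fixed once over the training set
vocabulary = sorted(counts)
features = to_feature_vector(counts, vocabulary)
```

A fixed vocabulary keeps the feature vector the same length for every file, which is what a tree-based classifier such as XGBoost expects as input.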

Results:

We evaluated files belonging to ten classes (one benign class and nine malware classes). We also ensured that each class is equally represented in the test set, so that we can verify the model does not simply assign every file to the same class.
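
One way to build a test set with equal class representation is sketched below. This is an assumption about how such a split could be done, not the project's own code: the helper `balanced_split`, its `per_class` and `seed` parameters, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def balanced_split(labels, per_class, seed=0):
    """Return (train_idx, test_idx) with exactly per_class test samples per label."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # seeded so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < per_class:
            raise ValueError(f"class {label!r} has fewer than {per_class} samples")
        rng.shuffle(indices)
        test_idx.extend(indices[:per_class])
        train_idx.extend(indices[per_class:])
    return train_idx, test_idx


# Toy data: 200 samples spread over ten classes (0 = benign, 1-9 = malware)
labels = [i % 10 for i in range(200)]
train_idx, test_idx = balanced_split(labels, per_class=5)
```

With ten equally sized classes in the test set, a model that always predicts a single class scores only 10% accuracy, so a collapse of that kind is easy to spot.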

Machine learning:

	Accuracy	Average loss
Train set	99.487231%	0.013942
Test set	94.611516%	0.249856

Deep Learning:

	Accuracy	Average loss
Train set	99.256321%	0.025617
Test set	91.666667%	0.368867
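
For reference, the sketch below shows how an accuracy and an average loss of the kind reported in the tables above can be computed from predicted class probabilities. The README does not state which loss was used; multi-class cross-entropy is an assumption here, and the function names and toy values are hypothetical.

```python
import math


def accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def average_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class (assumed loss)."""
    total = 0.0
    for probs, label in zip(probabilities, labels):
        total -= math.log(max(probs[label], eps))  # eps guards against log(0)
    return total / len(labels)


# Toy predictions over three classes; the third sample is misclassified
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.4, 0.5, 0.1]]
true_labels = [0, 1, 0]

acc = accuracy(probs, true_labels)
loss = average_cross_entropy(probs, true_labels)
```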
