Malware Classification

Malware Classification is a workflow that uses machine learning to classify unknown Windows Portable Executable (PE) files.

Setup

This project is developed using KNIME, an open-source data analytics platform, so you need to download and install KNIME before you can use the workflow.

Some further setup is then needed. The model must be trained before it can classify anything, and training requires samples. There are three directories in the workspace:

  • Data/Benign
  • Data/Malicious
  • Data/Unknown

These directories should contain known benign PE files, known malicious PE files, and the unknown PE files to be classified, respectively.
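
Before training, it can help to confirm that each directory really contains PE files. The following is a minimal sketch and not part of the workflow itself: the helper names `is_pe_file` and `summarize_samples` are hypothetical, and the script assumes it is run from the workspace root. It only checks the `MZ` magic and the `PE\0\0` signature, so it does not validate the rest of the file.

```python
import struct
from pathlib import Path


def is_pe_file(path: Path) -> bool:
    """Return True if the file has a DOS header that points to a PE signature."""
    try:
        with path.open("rb") as fh:
            dos_header = fh.read(64)
            # The DOS header is 64 bytes long and starts with the magic "MZ".
            if len(dos_header) < 64 or dos_header[:2] != b"MZ":
                return False
            # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
            (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
            fh.seek(pe_offset)
            return fh.read(4) == b"PE\x00\x00"
    except OSError:
        return False


def summarize_samples(workspace: Path) -> dict:
    """Count PE and non-PE files in each Data subdirectory (hypothetical helper)."""
    summary = {}
    for name in ("Benign", "Malicious", "Unknown"):
        directory = workspace / "Data" / name
        # A missing directory is reported as empty rather than raising.
        files = [p for p in directory.iterdir() if p.is_file()] if directory.is_dir() else []
        valid = sum(1 for p in files if is_pe_file(p))
        summary[name] = {"pe": valid, "other": len(files) - valid}
    return summary


if __name__ == "__main__":
    # Assumes the current directory is the workspace root.
    for name, counts in summarize_samples(Path(".")).items():
        print(f"Data/{name}: {counts['pe']} PE files, {counts['other']} other files")
```

A non-zero "other" count in Data/Benign or Data/Malicious suggests that some training samples are not PE files and may be worth removing before running the workflow.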

After you have acquired the benign and malicious samples needed to train the model, download the pecoff4j Java library into the Library directory. The workspace is configured to use version 0.0.2.1.

Classify Samples

Drop the PE files into the Data/Unknown directory and run the workflow. For the source code and more information, check the link.