
Malware classification based on graph convolutional neural networks and static call graph features

About

Authors

Abstract

Advanced Persistent Threats (APTs) are targeted, high-level cybersecurity risks facing governments, financial institutions and other organisations. Attribution of an APT, that is, gathering information about the origin of an attack, is a key step in securing an organisation’s infrastructure, because it allows the measures to be taken to be prioritized according to the actor(s) targeting the organisation. In practice, an elementary step in attribution is determining the family and/or author of a sample from the binary file and/or its dynamic analysis, i.e. a multi-class classification problem over the family/author label. Numerous methods in the literature aim to label a sample based on its control flow graph or API sequence graph. We summarize the literature on these methods and offer another method for classifying malware families, which leverages the static call graph of a PE executable together with the functions’ instruction lists and uses a locality-sensitive hashing method to obtain the node feature vectors. Our results are compared to recent publications in the field.

Keywords

static malware analysis, static call graph, graph convolutional neural networks, locality-sensitive hashing, family classification
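
To make the pipeline in the abstract more concrete, the following is a minimal sketch of its two central ideas: hashing each function's instruction list into a fixed-length node feature vector, and passing those features along call-graph edges as a graph convolution does. It is not the authors' implementation. It assumes SimHash over mnemonic bigrams as the locality-sensitive hash, treats the call graph as undirected, and omits the learned weight matrix and nonlinearity of a real GCN layer. The function names `simhash_features` and `gcn_propagate` and the toy call graph are hypothetical.

```python
import hashlib
import math


def simhash_features(instructions, n_bits=16, ngram=2):
    """Map a list of instruction mnemonics to an n_bits-long 0/1 vector.

    Assumption: SimHash over mnemonic n-grams stands in for the
    locality-sensitive hash; the paper's actual scheme may differ.
    """
    counts = [0] * n_bits
    n_tokens = max(1, len(instructions) - ngram + 1)
    tokens = [" ".join(instructions[i:i + ngram]) for i in range(n_tokens)]
    for tok in tokens:
        # hashlib gives a hash that is stable across interpreter runs
        digest = int.from_bytes(hashlib.md5(tok.encode()).digest(), "big")
        for bit in range(n_bits):
            counts[bit] += 1 if (digest >> bit) & 1 else -1
    # Each output bit is the majority vote of that bit over all n-grams
    return [1.0 if c > 0 else 0.0 for c in counts]


def gcn_propagate(edges, features):
    """One weight-free aggregation step: D^-1/2 (A + I) D^-1/2 X.

    Assumption: call edges are symmetrized; a trained GCN layer would
    additionally multiply by a weight matrix and apply a nonlinearity.
    """
    n = len(features)
    dim = len(features[0])
    neighbours = [{i} for i in range(n)]  # self-loops (the "+ I" term)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    degree = [len(s) for s in neighbours]
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in neighbours[i]:
            weight = 1.0 / math.sqrt(degree[i] * degree[j])
            for k in range(dim):
                out[i][k] += weight * features[j][k]
    return out


# Hypothetical toy call graph: main calls init and work.
functions = {
    "main": ["push", "mov", "call", "call", "ret"],
    "init": ["push", "mov", "xor", "ret"],
    "work": ["push", "mov", "add", "cmp", "jne", "ret"],
}
names = list(functions)
X = [simhash_features(functions[name]) for name in names]
call_edges = [(0, 1), (0, 2)]
H = gcn_propagate(call_edges, X)
```

Functions with similar instruction sequences share most n-grams and therefore tend to agree on most hash bits, which is the property that makes such a hash usable as a node feature. A graph-level family label would then be predicted from a pooled version of `H` after several such layers.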

Figures

Fig. 1. Outline of the project: malware family classification using graph convolutional neural networks trained on static call graphs.
Fig. 2. Histogram of the number of nodes and edges in a call graph.