Computer Engineering and Applications, 2021, Vol. 57, Issue (22): 131-138. DOI: 10.3778/j.issn.1002-8331.2007-0291

• Network, Communication and Security •

Malware Family Classification Based on Deep Learning Visualization

CHEN Xiaohan, WEI Shuning, QIN Zhengze   

  1. College of Information Science and Engineering, Hunan Normal University, Changsha 410006, China
  2. National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410006, China
  • Online: 2021-11-15 Published: 2021-11-16





The rapid development of computer network technology has led to a growing volume of malicious software. To address the problem of malware family classification, this paper proposes a classification method based on deep learning visualization. In this method, malware opcodes are converted into grayscale images that can be inspected directly. By using a Recursive Neural Network (RNN) to process opcode sequences, the method takes into account not only the original information of the malware but also the association between the original code and its timing characteristics, thus enhancing the information density of the classification features. SimHash is then used to generate feature images from the fusion of the original codes and the codes predicted by the RNN. As a result, images of malicious code from the same family are more similar to one another than to images from different families. Traditional classification models cannot extract classification features automatically. To address this problem, this paper uses a Convolutional Neural Network (CNN) to classify the feature images. The method has been implemented and tested on a set of 10,868 malware instances from 9 families. The classification accuracy reaches 98.8%, and effective, information-enhanced classification features are obtained.
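The opcode-to-image step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the hash function, the fingerprint length, the image size, or how the RNN-predicted codes are fused with the original ones. The sketch assumes SHA-256 as the per-token hash, opcode bigrams as tokens, and a 16 x 16 image with one pixel per fingerprint bit. The names `simhash_bits`, `opcode_image`, and `hamming` are hypothetical, and the RNN stage is omitted entirely.

```python
import hashlib
from collections import Counter


def simhash_bits(tokens, n_bits=256):
    """Return the SimHash fingerprint of `tokens` as a list of 0/1 ints."""
    if n_bits > 256:
        raise ValueError("SHA-256 provides at most 256 bits")
    votes = [0] * n_bits
    for token, weight in Counter(tokens).items():
        # Assumption: SHA-256 is the per-token hash (not stated in the abstract).
        digest = int.from_bytes(hashlib.sha256(token.encode("utf-8")).digest(), "big")
        for i in range(n_bits):
            # Each token votes +weight or -weight on every bit position.
            votes[i] += weight if (digest >> i) & 1 else -weight
    # A bit is set when the positive votes outweigh the negative ones.
    return [1 if v > 0 else 0 for v in votes]


def opcode_image(opcodes, side=16, ngram=2):
    """Map an opcode sequence to a side x side grayscale image (rows of 0/255)."""
    # Assumption: opcode n-grams serve as SimHash tokens.
    grams = [" ".join(opcodes[i:i + ngram]) for i in range(len(opcodes) - ngram + 1)]
    bits = simhash_bits(grams or opcodes, n_bits=side * side)
    return [[255 * b for b in bits[r * side:(r + 1) * side]] for r in range(side)]


def hamming(img_a, img_b):
    """Count the pixels that differ between two equally sized images."""
    return sum(a != b for ra, rb in zip(img_a, img_b) for a, b in zip(ra, rb))


if __name__ == "__main__":
    sample = ["push", "mov", "call", "add", "mov", "ret"]
    image = opcode_image(sample)
    print(len(image), "x", len(image[0]), "pixels")
```

Because SimHash tends to map similar token multisets to fingerprints that differ in few bits, images built this way from closely related opcode sequences are expected to differ in fewer pixels than images from unrelated sequences, which is the property the abstract relies on before handing the images to a CNN.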

Key words: malware family, malicious code visualization, Recursive Neural Network (RNN), Convolutional Neural Network (CNN), SimHash


