Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (24): 72-77. DOI: 10.3778/j.issn.1002-8331.1911-0001


Malicious Code Family Detection Technology Based on CNN-BiLSTM

WANG Guodong, LU Tianliang, YIN Haoran, ZHANG Jianlin   

  1. School of Information Engineering and Cyber Security, People’s Public Security University of China, Beijing 100035, China
  • Online: 2020-12-15 Published: 2020-12-15

基于CNN-BiLSTM的恶意代码家族检测技术

王国栋,芦天亮,尹浩然,张建岭   

  1. 中国人民公安大学 信息技术与网络安全学院,北京 100035

Abstract:

Most of the rapidly growing volume of malicious code seen in recent years is produced by mutating members of existing families, so detecting and classifying malicious code families is particularly important. This paper proposes a malicious code family detection method based on a CNN-BiLSTM network: the executable files of each malicious code family are converted directly into grayscale images, and a CNN-BiLSTM network model is used to detect and classify the resulting image dataset. The method extracts features comprehensively and efficiently while avoiding the risk of the malicious code harming the analysis machine. By combining the strengths of CNN and BiLSTM, it learns the features of each malicious code family from both local and global perspectives and classifies the samples accordingly. The experiment identifies 4,418 samples from 4 malicious code families, and the results show that the model achieves higher accuracy than traditional machine learning methods.

Key words: malicious code, family, grayscale image, deep learning, neural networks
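
The conversion step described in the abstract, turning an executable file directly into a grayscale image, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `bytes_to_grayscale` and `to_pgm`, the row width of 256, the zero padding of the last row, and the PGM output format are all assumptions made for the example, since the abstract does not state these details.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte (0-255) to one grayscale pixel, filling rows left to right.

    The default width and the zero padding are assumptions for illustration.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Zero-pad so that the final row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def to_pgm(image: List[List[int]]) -> bytes:
    """Serialize the pixel rows as a binary PGM (P5) grayscale image."""
    height = len(image)
    width = len(image[0]) if image else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixel for row in image for pixel in row)


if __name__ == "__main__":
    sample = bytes(range(10))  # stand-in for the raw bytes of an executable
    image = bytes_to_grayscale(sample, width=4)
    print(len(image), "rows x", len(image[0]), "columns")
```

The file is only read as raw bytes and never executed, which is consistent with the abstract's point about avoiding harm to the analysis machine. The resulting pixel rows would then be fed to the classification model.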

摘要:

近年来快速增加的恶意代码数量中大部分是由原有家族中通过变异产生,所以对恶意代码家族进行检测分类显得尤为重要。提出了一种基于CNN-BiLSTM网络的恶意代码家族检测方法,将恶意代码家族可执行文件直接转换为灰度图像,利用CNN-BiLSTM网络模型对图像数据集进行检测分类。此方法在避免计算机受到恶意代码伤害的同时全面高效地提取特征,结合CNN和BiLSTM的优点从局部和全局两个方面学习恶意代码家族的特征并实现分类。实验对4个恶意代码家族的4 418个样本进行识别,结果表明该模型相对于传统机器学习具有更高的准确率。

关键词: 恶意代码, 家族, 灰度图, 深度学习, 神经网络