Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (10): 81-87. DOI: 10.3778/j.issn.1002-8331.2003-0397

• Network, Communication and Security •

Improved Malware Detection Method Based on DNN

ZHANG Bohan, LING Jie   

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  • Online: 2021-05-15 Published: 2021-05-10

Abstract:

Most current deep-learning-based malware detection methods generalize poorly because their model structures and sample preprocessing are not well suited to the task. In other words, a trained detection model may perform poorly on malware that is absent from the training sample set or that has newly emerged. This paper proposes an improved malware detection method based on a Deep Neural Network (DNN). The method builds the detection model from multiple fully connected layers and introduces a directional Dropout regularization method to prune the weights of the neural network during training. Experimental results on the Virusshare and lynx-project sample sets show that, compared with DeepMalNet, another DNN-based malware detection model, the improved method raises the average predicted probability on the malicious PE sample set by 0.048 and lowers the average predicted probability on the packed benign PE sample set by 0.64. These results indicate that the improved method generalizes better and detects malware outside the training sample set more effectively.

Key words: PE file, malware detection, deep learning, neural network, Deep Neural Network (DNN)
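
The abstract does not spell out how the directional Dropout step chooses which weights to prune. As a rough illustration only, the sketch below assumes a magnitude-based scheme: for each output unit of a fully connected layer, the lowest-magnitude fraction of incoming weights become drop candidates, and each candidate is zeroed with some probability during a training step. The function names and the `gamma` and `alpha` parameters are hypothetical and are not taken from the paper.

```python
import random


def dropout_mask_by_magnitude(weights, gamma=0.5, alpha=0.5, rng=None):
    """Build a 0/1 mask for one fully connected layer.

    weights: list of rows, shape (n_in, n_out).
    gamma:   assumed fraction of lowest-magnitude weights per output
             unit that are candidates for dropping.
    alpha:   assumed probability of dropping each candidate.
    """
    rng = rng or random.Random()
    n_in, n_out = len(weights), len(weights[0])
    k = int(gamma * n_in)  # number of candidates per output unit
    mask = [[1.0] * n_out for _ in range(n_in)]
    for j in range(n_out):
        # Rank the incoming weights of output unit j by absolute value.
        order = sorted(range(n_in), key=lambda i: abs(weights[i][j]))
        for i in order[:k]:
            if rng.random() < alpha:
                mask[i][j] = 0.0
    return mask


def apply_mask(weights, mask):
    """Element-wise product of a weight matrix and its mask."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Toy layer with 4 inputs and 2 outputs.
W = [[0.1, 3.0],
     [-2.0, 0.2],
     [0.3, -0.1],
     [1.5, 4.0]]
mask = dropout_mask_by_magnitude(W, gamma=0.5, alpha=1.0)
pruned = apply_mask(W, mask)
```

With `alpha=1.0` every candidate is dropped, so the two smallest-magnitude weights in each column are zeroed. With `alpha` below 1 the mask would be resampled at every training step, so low-magnitude weights are suppressed often but not always.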