Computer Engineering and Applications (计算机工程与应用), 2020, Vol. 56, Issue (14): 88-92. DOI: 10.3778/j.issn.1002-8331.1904-0420

• Network, Communication and Security •

Research of Webshell Detection Based on RNN (基于RNN的Webshell检测研究)

ZHOU Long (周龙), WANG Chen (王晨), SHI Yin (史崯)

  1. Wuhan Research Institute of Posts and Telecommunications, Wuhan 430000, China
  2. Nanjing Fiberhome Software Polytron Technologies Inc, Nanjing 210000, China
  • Online: 2020-07-15  Published: 2020-07-14

Abstract:

In recent years, the Internet industry has developed rapidly, and the importance of network security has grown with it. The field covers many problems, such as malicious code detection and attack tracing, and Webshells, as one kind of malicious code, have attracted attention from both academia and industry. Besides simple but inefficient keyword matching, various machine learning algorithms have been used to detect Webshells. Once Webshell code has been processed with evasion techniques, detection algorithms based on keyword matching can no longer identify it effectively, and conventional machine learning algorithms cannot extract deep features, so their detection accuracy is low. Therefore, a Webshell detection method based on recurrent neural networks (RNN) is proposed. Experimental results show that the proposed method outperforms conventional machine learning algorithms in accuracy, missed-report rate, and false alarm rate. The proposed algorithm can still be improved, because it extracts sequence features and its false alarm rate remains slightly high.

Keywords: network security, Webshell, deep learning
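
The abstract contrasts keyword matching with learned detection and names three evaluation metrics. The following is a minimal Python sketch of that contrast, not the authors' code. The keyword list, the toy PHP snippets, and the names `keyword_detect`, `evaluate`, and `Metrics` are all assumptions made for illustration. It shows how a naive keyword matcher misses a Webshell that hides a function name by string splitting, and how accuracy, missed-report rate, and false alarm rate are computed from the resulting predictions.

```python
from dataclasses import dataclass

# Hypothetical signature list, for illustration only.
SUSPICIOUS_KEYWORDS = ("eval(", "system(", "shell_exec(")

# Toy PHP snippets with labels (True = Webshell). Illustrative, not the paper's data.
SAMPLES = [
    ('<?php eval($_POST["cmd"]); ?>', True),
    ('<?php $f = "sys" . "tem"; $f($_GET["c"]); ?>', True),  # evasion by string splitting
    ('<?php echo "Hello, world"; ?>', False),
    ('<?php echo strlen($name); ?>', False),
    ('<?php /* never pass user input to system() */ echo "ok"; ?>', False),
]


def keyword_detect(source: str) -> bool:
    """Flag a script as a Webshell if it contains any listed keyword."""
    lowered = source.lower()
    return any(keyword in lowered for keyword in SUSPICIOUS_KEYWORDS)


@dataclass
class Metrics:
    accuracy: float          # correct decisions / all samples
    miss_rate: float         # missed Webshells / all Webshells
    false_alarm_rate: float  # flagged benign files / all benign files


def evaluate(predictions, labels) -> Metrics:
    """Compute the three metrics from boolean predictions and labels."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)
    fn = sum(1 for p, y in pairs if not p and y)
    fp = sum(1 for p, y in pairs if p and not y)
    tn = sum(1 for p, y in pairs if not p and not y)
    total = len(pairs)
    return Metrics(
        accuracy=(tp + tn) / total if total else 0.0,
        miss_rate=fn / (tp + fn) if (tp + fn) else 0.0,
        false_alarm_rate=fp / (fp + tn) if (fp + tn) else 0.0,
    )


if __name__ == "__main__":
    predictions = [keyword_detect(src) for src, _ in SAMPLES]
    labels = [label for _, label in SAMPLES]
    print(evaluate(predictions, labels))
```

On these toy inputs the string-splitting sample goes undetected and the benign file that merely mentions `system()` in a comment is flagged, which are the kinds of errors that motivate learning features from the code sequence rather than relying on fixed keywords.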