Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (11): 98-104. DOI: 10.3778/j.issn.1002-8331.1902-0102

• Network, Communication and Security •

基于行为路径树的恶意软件分类方法

金炳初,文辉,石志强,张智渊,陈俊杰   

  1. 太原理工大学 信息与计算机学院,太原 030024
  2. 中国科学院 信息工程研究所 物联网信息安全技术北京市重点实验室,北京 100195
  3. 中国科学院大学 网络空间安全学院,北京 100195
  • Publication date: 2020-06-01 Release date: 2020-06-01

Malware Classification Method Based on Path Tree of Behavior

JIN Bingchu, WEN Hui, SHI Zhiqiang, ZHANG Zhiyuan, CHEN Junjie   

  1. College of Information and Computer Science, Taiyuan University of Technology, Taiyuan 030024, China
  2. Beijing Key Laboratory of IOT Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100195, China
  3. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100195, China
  • Online: 2020-06-01 Published: 2020-06-01

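Abstract (translated from the Chinese):

To address the problem of malware family classification, this paper proposes a classification method based on behavior path trees. The method uses the fine-grained behavior paths of malicious samples as dynamic features and generates dependency relationships by converting the paths into a tree structure. Compared with traditional malware classification based on system calls, it has lower complexity. In addition, because traditional classification models cannot optimize the depth of the behavior path tree, a classification model based on a self-adaptive random forest is designed, which performs depth optimization of the behavior path tree by random approximation. In the experiments, 2,588 samples (8 malware families and 1 benign set) are used to validate the effectiveness of the behavior path tree, and the classification accuracy reaches 91.11%.

Keywords: behavior path tree, malware classification, dynamic feature, self-adaptive random forest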

Abstract:

This paper proposes a fast malware classification method that uses a novel dynamic feature based on the path tree of behavior. The method performs dynamic analysis to obtain fine-grained behavior execution traces of malware, which serve as behavior paths. These behavior paths are then organized into a tree that characterizes the dependencies among behaviors. Compared with traditional system-call-based malware classification, the method has lower complexity. In addition, traditional classification models cannot optimize the depth of the path tree. To address this problem, the paper builds a classification model based on a self-adaptation random forest, which uses random approximation to search for the tree depth. The method has been implemented and tested on a set of 2,588 samples (8 malware families and 1 benign set), and the classification accuracy reaches 91.11%.

Key words: path tree of behavior, malware classification, dynamic feature, self-adaptation random forest