Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (20): 306-314. DOI: 10.3778/j.issn.1002-8331.2406-0300

• Network, Communication and Security •

Multimodal Fusion Malware Detection Method Based on CBAM-GLU-ISF

PENG Feihong, LIU Wanping, HUANG Dong

  1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
  2. Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang 550025, China
  • Online: 2025-10-15 Published: 2025-10-15

Abstract: Malware commonly employs code obfuscation. Detection methods based on a single feature have limited feature information and achieve lower detection accuracy than multi-feature methods. Latent correlations between modalities can raise the upper bound on detection performance, but current multi-feature methods do not model the correlations between features when fusing them, which results in suboptimal detection accuracy. To characterize malware more comprehensively and improve detection accuracy, this paper proposes CBAM-GLU-ISF, a multimodal fusion malware detection method based on a convolutional neural network (CNN) and a gated linear unit (GLU). Two modalities of malware, the grayscale image and the byte sequence, are taken as the objects of analysis. First, a convolutional block attention module (CBAM) is added to the CNN; it combines channel attention and spatial attention to extract key features from the grayscale image. The byte sequence is the most direct representation of software on a computer; the GLU, combined with an additive attention mechanism, captures long-sequence dependencies efficiently and on that basis extracts key features from the byte sequence. Then, the multimodal feature fusion module (ISF) fuses the features of the two modalities produced by the parallel feature extraction networks, exploiting the correlation information between them, and represents the malware as a more comprehensive multimodal feature. Finally, a detection layer performs malware identification. Experimental results show that the proposed method reaches a detection accuracy of 99.1% and an AUC of 99.8%, a clear improvement over the single-feature and multi-feature detection algorithms in existing work, which verifies the effectiveness of the method.

Key words: malware detection, multimodal fusion, convolutional neural network (CNN), convolutional block attention module (CBAM), gated linear unit (GLU)
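
The abstract names two input modalities (grayscale image and byte sequence) and a GLU-based sequence branch, but it does not state how the inputs are built or which layer sizes are used. The following is a minimal Python sketch, not the authors' implementation. It assumes the common convention of mapping each byte of a binary to one 0-255 pixel, and it uses the standard GLU definition (values multiplied element-wise by the sigmoid of the gates). The function names, the image `width`, the sequence `max_len`, and the zero padding are all hypothetical choices made for illustration.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Reshape raw bytes into rows of `width` pixels (one byte = one pixel).

    Assumption: the last row is zero-padded; `width` is illustrative only.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)  # pad up to a full final row
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def bytes_to_sequence(data: bytes, max_len: int = 32) -> list[int]:
    """Truncate or zero-pad raw bytes to a fixed-length integer sequence.

    Simplification: 0 doubles as the padding value, although 0x00 is also
    a valid byte; a real model might reserve a separate padding token.
    """
    seq = list(data[:max_len])
    return seq + [0] * (max_len - len(seq))


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def glu(values: list[float], gates: list[float]) -> list[float]:
    """Gated linear unit: values * sigmoid(gates), element-wise."""
    if len(values) != len(gates):
        raise ValueError("values and gates must have the same length")
    return [v * sigmoid(g) for v, g in zip(values, gates)]
```

Calling `bytes_to_grayscale(bytes(range(20)), width=8)` yields three rows of eight pixels, with the last row padded by four zeros, while `bytes_to_sequence` gives the same bytes as a flat fixed-length sequence for the GLU branch. In a full model the `values` and `gates` arguments of `glu` would come from two learned projections of the embedded byte sequence; here they are passed in directly to keep the gating step visible.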