Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (3): 315-325. DOI: 10.3778/j.issn.1002-8331.2405-0075

• Network, Communication and Security •

AE-EM: Web Intrusion Detection Algorithm Based on Expectation Maximization

YIN Zhaoliang, HUANG Yuxin, YU Zhengtao   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  2. Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming 650500, China
  3. Yunnan Branch of National Computer Network Emergency Response Technical Team/Coordination Center of China, Kunming 650100, China
  • Online: 2025-02-01 Published: 2025-01-24

Abstract: Existing intrusion detection algorithms concentrate on pattern matching, threshold segmentation, machine learning methods such as multilayer perceptrons, and neural-network-based deep learning. These methods are effective against signature-based and anomaly-based intrusions, but they are time-consuming and labor-intensive. In Web intrusion scenarios, existing methods center detection on network traffic analysis (NTA) and extract too little of the semantic association between the payload carried in URLs and the traffic itself, which leaves room to improve anomaly detection. This paper proposes an unsupervised algorithm, the attention expand expectation-maximization algorithm (AE-EM). The algorithm extracts the semantics of attack payloads from application-layer URLs, uses an attention mechanism to jointly encode structured network-layer traffic data, and trains vectors that fuse multidimensional features with the associated application-layer semantics as the algorithm's input. A lightweight expectation-maximization algorithm then estimates the parameters of a Gaussian mixture model, which is applied to Web intrusion detection. In comparisons with commonly used learning algorithms and in ablation experiments on baseline datasets, the proposed AE-EM algorithm outperforms traditional algorithms in accuracy and performance for Web intrusion detection.

Key words: intrusion detection, Web attack detection, attention mechanism, expectation-maximization (EM) algorithm, attention expand expectation-maximization algorithm (AE-EM)
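
Illustrative sketch: the abstract's final stage, using expectation maximization to estimate Gaussian mixture parameters and then flagging low-likelihood inputs, can be shown with a small generic example. This is not the authors' AE-EM implementation. It assumes diagonal covariances, uses random 2-D points as a stand-in for the fused attention-encoded vectors, and treats the lowest training log-likelihood as the anomaly threshold; the names `fit_gmm`, `log_likelihood`, and `is_anomalous` are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(data, k=2, iters=50, seed=0, min_var=1e-6):
    """Fit a k-component diagonal GMM with plain EM.

    Returns (weights, means, variances). Assumed setup, not the paper's.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    means = [list(p) for p in rng.sample(data, k)]  # init means from data
    gmean = [sum(p[d] for p in data) / n for d in range(dim)]
    gvar = [
        max(sum((p[d] - gmean[d]) ** 2 for p in data) / n, min_var)
        for d in range(dim)
    ]
    variances = [list(gvar) for _ in range(k)]  # init with global variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in data:
            logs = [
                math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                for j in range(k)
            ]
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, variances from responsibilities
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty component
            weights[j] = nj / n
            means[j] = [
                sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                for d in range(dim)
            ]
            variances[j] = [
                max(
                    sum(r[j] * (x[d] - means[j][d]) ** 2 for r, x in zip(resp, data)) / nj,
                    min_var,  # floor keeps a component from collapsing
                )
                for d in range(dim)
            ]
    return weights, means, variances


def log_likelihood(x, model):
    """Log-likelihood of one point under the fitted mixture."""
    weights, means, variances = model
    return log_sum_exp(
        [math.log(w) + log_gauss_diag(x, m, v)
         for w, m, v in zip(weights, means, variances)]
    )


def is_anomalous(x, model, threshold):
    """Flag a point whose log-likelihood falls below the threshold."""
    return log_likelihood(x, model) < threshold


# Stand-in "normal traffic" vectors: two clusters of synthetic 2-D points.
_rng = random.Random(1)
train = [[_rng.gauss(0, 0.5), _rng.gauss(0, 0.5)] for _ in range(100)]
train += [[_rng.gauss(5, 0.5), _rng.gauss(5, 0.5)] for _ in range(100)]
model = fit_gmm(train, k=2)
threshold = min(log_likelihood(p, model) for p in train)  # assumed rule
```

With this setup, `is_anomalous([20.0, -20.0], model, threshold)` checks a point far from both clusters. In practice the threshold would more likely be a low percentile of the training scores, tuned on validation data.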