Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue (7): 353-360. DOI: 10.3778/j.issn.1002-8331.2311-0420

• Engineering and Applications •

Click-Through Rate Prediction Model Based on Cross Attention

HE Lijie, GAO Maoting

  1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
  • Online: 2025-04-01 Published: 2025-04-01

Abstract: Mining effective features is the key to click-through rate (CTR) prediction. Existing CTR prediction models give insufficient consideration to the information exchange between local combined features and the global feature, and they express the importance of combined features inaccurately. To address these problems, a CTR prediction model based on cross attention (CANN) is proposed, which uses a cross attention mechanism to highlight the connection and exchange between combined features and the global feature so that effective features are fully mined. First, a feature value is obtained for each feature by global average pooling, and these values are concatenated to form the global feature. Then, combined features are captured by axis-weighted fusion. Next, the cross attention mechanism crosses the global feature with the combined features to obtain a weight for each combined feature that expresses its importance, and the weighted combined features are fused into the global feature to improve information exchange. Finally, the CTR prediction is obtained through a multi-layer perceptron. Experiments on two public real-world datasets verify the effectiveness of the model.

Key words: click-through rate prediction, cross attention, feature interaction, neural network
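
The four steps summarized in the abstract can be illustrated with a minimal NumPy sketch of a single-sample forward pass. The abstract does not give the model's equations, so everything concrete here is an assumption made for illustration: the function names `cann_forward` and `init_params`, the parameter names, the layer sizes, the use of pairwise Hadamard products as combined features, the softmax-normalized per-dimension weights standing in for axis-weighted fusion, and the residual-style fusion into the global feature. It is not the authors' implementation.

```python
import numpy as np
from itertools import combinations


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    z = np.exp(x - np.max(x))
    return z / z.sum()


def cann_forward(emb, params):
    """Hypothetical one-sample forward pass; emb has shape (m fields, d dims)."""
    # 1. Global average pooling: one scalar per feature embedding,
    #    concatenated into the global feature.
    g = emb.mean(axis=1)                                   # (m,)

    # 2. Combined features. ASSUMPTION: pairwise Hadamard products,
    #    reweighted along the embedding axis (a stand-in for the
    #    paper's axis-weighted fusion, whose details are not given).
    pairs = list(combinations(range(emb.shape[0]), 2))
    prod = np.stack([emb[i] * emb[j] for i, j in pairs])   # (p, d)
    comb = prod * softmax(params["axis_logits"])           # (p, d)

    # 3. Cross attention: the global feature forms the query, the
    #    combined features form keys and values. The attention weights
    #    play the role of combined-feature importance.
    q = g @ params["Wq"]                                   # (k,)
    keys = comb @ params["Wk"]                             # (p, k)
    vals = comb @ params["Wv"]                             # (p, m)
    attn = softmax(keys @ q / np.sqrt(q.size))             # (p,)
    # ASSUMPTION: fuse the weighted combined features by addition.
    fused = g + attn @ vals                                # (m,)

    # 4. Multi-layer perceptron with a sigmoid output for the CTR.
    h = np.maximum(0.0, fused @ params["W1"] + params["b1"])
    logit = h @ params["w2"] + params["b2"]
    ctr = 1.0 / (1.0 + np.exp(-logit))
    return ctr, attn, g


def init_params(m, d, k=8, hidden=16, seed=0):
    # Random, untrained parameters; sizes are illustrative only.
    rng = np.random.default_rng(seed)
    return {
        "axis_logits": rng.normal(size=d),
        "Wq": rng.normal(scale=0.1, size=(m, k)),
        "Wk": rng.normal(scale=0.1, size=(d, k)),
        "Wv": rng.normal(scale=0.1, size=(d, m)),
        "W1": rng.normal(scale=0.1, size=(m, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(scale=0.1, size=hidden),
        "b2": 0.0,
    }


m, d = 4, 6  # 4 feature fields, 6-dimensional embeddings
emb = np.random.default_rng(1).normal(size=(m, d))
ctr, attn, g = cann_forward(emb, init_params(m, d))
print(f"predicted CTR: {ctr:.4f}")
print("attention weights over feature pairs:", np.round(attn, 3))
```

With untrained random parameters the printed probability carries no meaning; the sketch only shows how the tensors flow. The attention weights sum to 1 across the feature pairs, which is one plausible way to read "the weight of each combined feature expresses its importance".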