Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (24): 158-165. DOI: 10.3778/j.issn.1002-8331.2106-0001

• Pattern Recognition and Artificial Intelligence •

Click-Through Rate Prediction Model of Enhanced High-Order Attentive Factorization Machine

CHEN Yukang, LONG Huiyun, WU Yun, LIN Jian   

  1. School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  • Online: 2022-12-15  Published: 2022-12-15

Abstract: Modeling feature interactions is critical for predicting users' click-through rate in recommender systems. This paper designs an enhanced high-order attentive factorization machine (EHAFM) model, which is mainly composed of an embedding layer, explicit feature interaction layers, and an output layer. In each explicit feature interaction layer, a feature's representation is updated by aggregating the representations of the other features. To address the problem that useless feature interactions interfere with prediction, an enhanced element-wise (bit-wise) attention mechanism is proposed: projection matrices expand the feature representation space to strengthen the learning ability of the attention matrix, and the information of multiple enhanced attention heads is fused to alleviate insufficient generalization ability. By stacking explicit feature interaction layers, the feature representations can be updated to an arbitrarily high order, and the high-order feature interaction part is finally combined with the first-order linear part to predict the click-through rate. Experiments on the Criteo and Movielens-1M datasets show that, compared with the benchmark models, EHAFM achieves AUC improvements of 0.21% and 0.92%, respectively.
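
To make the interaction layer described above concrete, the following is a minimal PyTorch-style sketch written only from the abstract (the paper's code and exact equations are not reproduced here); the class names, projection size, head count, softmax normalization, and residual connection are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn


class ElementWiseAttentionHead(nn.Module):
    # One "enhanced element-wise attention" head: projection matrices expand each
    # field embedding before a per-dimension attention weight is computed for every
    # field pair, so uninformative interactions can be down-weighted dimension by dimension.
    def __init__(self, embed_dim, proj_dim):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, proj_dim)
        self.k_proj = nn.Linear(embed_dim, proj_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.score = nn.Linear(proj_dim, embed_dim)   # one weight per embedding dimension

    def forward(self, x):                             # x: (batch, num_fields, embed_dim)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        pair = q.unsqueeze(2) * k.unsqueeze(1)        # (batch, fields, fields, proj_dim)
        weights = torch.softmax(self.score(pair), dim=2)   # normalize over aggregated fields
        return (weights * v.unsqueeze(1)).sum(dim=2)  # (batch, fields, embed_dim)


class ExplicitInteractionLayer(nn.Module):
    # Updates every field's representation by attending over the other fields and
    # fusing several heads; stacking these layers raises the interaction order.
    def __init__(self, embed_dim, proj_dim=32, num_heads=2):
        super().__init__()
        self.heads = nn.ModuleList(
            [ElementWiseAttentionHead(embed_dim, proj_dim) for _ in range(num_heads)])
        self.fuse = nn.Linear(num_heads * embed_dim, embed_dim)

    def forward(self, x):
        fused = self.fuse(torch.cat([h(x) for h in self.heads], dim=-1))
        return x + fused                              # residual keeps lower-order interactions

A full EHAFM-style predictor along these lines would embed each categorical field, stack several such layers to reach the desired interaction order, and feed the resulting high-order representation together with a first-order linear term through a sigmoid to produce the click-through probability.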

Key words: deep learning, explicit feature interactions, high-order factorization machine, attention mechanism