Computer Engineering and Applications, 2025, Vol. 61, Issue (17): 251-258. DOI: 10.3778/j.issn.1002-8331.2407-0299

• Pattern Recognition and Artificial Intelligence •

融合话题信息和语境不协调信息的讽刺识别模型

段玉俊,张顺香,钱龙海,文华,丁远远,葛唱   

  1. 安徽理工大学 计算机科学与工程学院,安徽 淮南 232001
  2. 合肥综合性国家科学中心 人工智能研究院,合肥 230000
  3. 淮南师范学院 计算机学院,安徽 淮南 232038
  • Publication date: 2025-09-01 Release date: 2025-09-01

Sarcasm Detection Model Incorporating Topic Information and Contextual Incongruity Information

DUAN Yujun, ZHANG Shunxiang, QIAN Longhai, WEN Hua, DING Yuanyuan, GE Chang   

  1. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
  2. Artificial Intelligence Research Institute of Hefei Comprehensive National Science Center, Hefei 230000, China
  3. School of Computer, Huainan Normal University, Huainan 232038, China
  • Online: 2025-09-01 Published: 2025-09-01

Abstract: Existing topic-oriented sarcasm detection studies do not jointly consider contextual incongruity within a sentence and between sentences in the surrounding context, which limits detection accuracy. To address this problem, a sarcasm detection model that incorporates topic information and contextual incongruity information is proposed. First, BERT is used to obtain word vectors for the topic text and the comment text separately, Bi-LSTM then extracts further semantic features, and cross-attention produces comment text features fused with topic information. In addition, nouns and their adjacent opinion words in the topic text and comment text are grouped into word blocks, Word2Vec is used to obtain word block vectors, and a self-attention mechanism captures the contextual incongruity between word blocks that share the same noun but express different opinions. The comment text features fused with topic information are concatenated with the incongruity information between word blocks, and Softmax is used to produce the sarcasm detection result. Experimental results show that, by accounting for both topic information and contextual incongruity information, the model improves the accuracy of sarcasm detection.

Keywords: sarcasm detection, topic-oriented sarcasm detection, incongruity information, attention mechanism
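
The two-branch fusion described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Random vectors stand in for the BERT + Bi-LSTM features and the Word2Vec word block vectors, and several details are assumptions not stated in the abstract: comment tokens act as attention queries over topic tokens, each branch is mean-pooled before concatenation, and the final layer is a single linear map. The names `attention`, `build_word_blocks`, and `classify` are hypothetical, as is the adjacency rule used to form word blocks.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-width vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def mean_pool(vectors):
    """Average a list of vectors into one vector (assumed pooling choice)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_word_blocks(tagged_tokens, opinion_words):
    """Toy rule (assumption): pair each noun with a directly adjacent opinion word."""
    blocks = []
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag != "NOUN":
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(tagged_tokens) and tagged_tokens[j][0] in opinion_words:
                blocks.append((word, tagged_tokens[j][0]))
    return blocks


def classify(comment_feats, topic_feats, block_vecs, weight, bias):
    """Fuse both branches and return class probabilities."""
    # Branch 1: cross-attention, comment tokens attend to topic tokens.
    fused = attention(comment_feats, topic_feats, topic_feats)
    # Branch 2: self-attention among word block vectors.
    incongruity = attention(block_vecs, block_vecs, block_vecs)
    # Concatenate the pooled branch outputs, then linear layer + softmax.
    features = mean_pool(fused) + mean_pool(incongruity)
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weight, bias)]
    return softmax(logits)


def rand_vecs(n, d):
    """Random stand-ins for learned embeddings and weights."""
    return [[random.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]


random.seed(0)
DIM = 4
probs = classify(
    comment_feats=rand_vecs(5, DIM),   # stand-in for BERT + Bi-LSTM comment features
    topic_feats=rand_vecs(3, DIM),     # stand-in for BERT + Bi-LSTM topic features
    block_vecs=rand_vecs(2, DIM),      # stand-in for Word2Vec word block vectors
    weight=rand_vecs(2, 2 * DIM),      # untrained weights for two classes
    bias=[0.0, 0.0],
)
print(probs)
```

With untrained random weights the printed probabilities carry no meaning; the sketch only shows how the topic-fused comment features and the word block incongruity features are produced separately and then joined before the Softmax layer.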