Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (13): 164-170. DOI: 10.3778/j.issn.1002-8331.2012-0301

• Pattern Recognition and Artificial Intelligence •

RoBERTa-Based Sarcasm Detection Model in Conversation Threads from Social Media

WEI Pengfei, ZENG Bi, LIAO Wenxiong

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510000, China
  • Online: 2022-07-01 Published: 2022-07-01

Abstract: Sarcasm is a rhetorical device commonly used on social media and is widespread on platforms such as Twitter and Reddit; it uses metaphor, exaggeration, and similar devices to negate, criticize, or ridicule people or things. Sarcasm detection is essential for understanding people's actual emotions and beliefs. This paper addresses sarcasm detection for a target text that comes with conversational context and proposes a RoBERTa-based sarcasm detection model for conversation threads from social media. The model consists of two main parts. The first is a feature extraction layer, which uses RoBERTa, a more robust transfer learning model, to learn features of the conversational context and of the target text separately. The second is a feature fusion layer: because the target text is a reply to the conversational context, and simple concatenation does not capture the dialogue relationship between the two well, the model adopts an improved version of the attention-over-attention (AOA) module so that the target text can attend to the important information in the conversational context. Experiments on the public Twitter and Reddit datasets verify the effectiveness of the model. The paper also analyzes how the presence or absence of conversational context, and the number of context utterances, affect performance when detecting sarcasm in the target text.

Key words: natural language understanding, deep learning, sarcasm detection, transfer learning, attention mechanism, one-dimensional convolution, binary classification
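
To make the feature fusion idea in the abstract more concrete, the following is a minimal sketch of the standard attention-over-attention computation in NumPy. It is not the authors' implementation: the abstract does not specify how their improved AOA variant differs, so the sketch shows only the baseline mechanism. The function names, array shapes, and the random matrices standing in for RoBERTa token features are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_over_attention(context, target):
    """Baseline AOA (illustrative; not the paper's improved variant).

    context: (n, d) token features of the conversational context
    target:  (m, d) token features of the target text
    Returns the attention weights over context tokens, shape (n,),
    and the context vector pooled with those weights, shape (d,).
    """
    scores = context @ target.T        # (n, m) pairwise match scores
    alpha = softmax(scores, axis=0)    # per target token: weights over context tokens
    beta = softmax(scores, axis=1)     # per context token: weights over target tokens
    beta_avg = beta.mean(axis=0)       # (m,) average importance of each target token
    gamma = alpha @ beta_avg           # (n,) attention over context tokens
    return gamma, gamma @ context


# Random features stand in for RoBERTa outputs (hypothetical shapes).
rng = np.random.default_rng(0)
context = rng.normal(size=(6, 8))      # 6 context tokens, hidden size 8
target = rng.normal(size=(4, 8))       # 4 target tokens, hidden size 8
gamma, fused = attention_over_attention(context, target)
```

Here gamma is a probability distribution over context tokens, so fused is a weighted average of context features conditioned on the target text. In a full model, a vector like this would be combined with the target representation before the binary classifier; how the paper does that is not stated in the abstract.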