Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (5): 130-138. DOI: 10.3778/j.issn.1002-8331.2210-0271

• Pattern Recognition and Artificial Intelligence •

Bidirectional Interaction Model for Joint Multiple Intent Detection and Slot Filling

LI Shi, SUN Zhenpeng

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  • Online: 2024-03-01 Published: 2024-03-01

Abstract: Intent detection and slot filling are the two main tasks of spoken language understanding; they are highly correlated and are usually trained jointly. As research on spoken language understanding has progressed, it has been found that user utterances in real-world scenarios often contain multiple intents. However, some joint models can detect only a single intent in an utterance and fail to adequately model the correlation between multiple intents and slots. Because the multiple-intent information in an utterance can guide slot filling, and slot information can in turn help intent detection, the Label Bi-Interaction model uses a graph attention network to establish a bidirectional interaction between intents and slots. Specifically, the model associates the two tasks bidirectionally so that it can explore the relationship between multiple intents and slots, and it introduces the label information of both tasks so that it can learn the relationship between utterance context and labels. This improves the accuracy of intent detection and slot filling and the overall performance of spoken language understanding. Experiments show that the model significantly outperforms other models on two multi-intent datasets, MixATIS and MixSNIPS.

Key words: spoken language understanding, multi-intent detection, slot filling, joint model
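
The bidirectional intent-slot interaction described in the abstract can be illustrated with a minimal sketch of a single-head graph attention layer in NumPy. This is not the authors' implementation: the abstract does not specify the graph structure, so the adjacency used here (every intent node linked to every slot node in both directions, plus self-loops), the random node features, and the names `gat_layer` and `build_bidirectional_adj` are all assumptions made for illustration.

```python
import numpy as np


def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer (standard GAT formulation).

    h:   (n, d_in) node features
    adj: (n, n) boolean adjacency; adj[i, j] means node i attends to node j
    W:   (d_in, d_out) shared linear projection
    a:   (2 * d_out,) attention vector
    """
    z = h @ W                                  # projected features, (n, d_out)
    d_out = z.shape[1]
    # a^T [z_i || z_j] splits into a source term plus a target term.
    src = z @ a[:d_out]                        # (n,)
    dst = z @ a[d_out:]                        # (n,)
    e = src[:, None] + dst[None, :]            # raw scores, (n, n)
    e = np.where(e > 0, e, slope * e)          # LeakyReLU
    e = np.where(adj, e, -np.inf)              # mask out non-edges
    e = e - e.max(axis=1, keepdims=True)       # numerical stability
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)   # softmax over each node's neighbours
    return alpha @ z, alpha


def build_bidirectional_adj(n_intents, n_slots):
    """Assumed graph: self-loops plus intent<->slot edges in both directions."""
    n = n_intents + n_slots
    adj = np.eye(n, dtype=bool)
    adj[:n_intents, n_intents:] = True         # intent nodes attend to slot nodes
    adj[n_intents:, :n_intents] = True         # slot nodes attend to intent nodes
    return adj


# Hypothetical toy input: 2 intent nodes and 3 slot nodes with random features.
rng = np.random.default_rng(0)
n_intents, n_slots, d_in, d_out = 2, 3, 8, 4
h = rng.normal(size=(n_intents + n_slots, d_in))
W = rng.normal(size=(d_in, d_out))
a = rng.normal(size=2 * d_out)
adj = build_bidirectional_adj(n_intents, n_slots)
out, alpha = gat_layer(h, adj, W, a)
```

In this sketch each intent node aggregates information from all slot nodes and each slot node aggregates from all intent nodes, which is one simple way to realize a two-way interaction; a trained model would learn `W` and `a` rather than sample them.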