Computer Engineering and Applications (计算机工程与应用) ›› 2020, Vol. 56 ›› Issue (4): 140-145. DOI: 10.3778/j.issn.1002-8331.1811-0020

• Pattern Recognition and Artificial Intelligence •

Research on Chinese Scenic Spot Named Entity Recognition Based on Convolutional Neural Network

LIU Xiaoan (刘小安), PENG Tao (彭涛)

  1. Smart City College, Beijing Union University, Beijing 100101, China
  2. College of Robotics, Beijing Union University, Beijing 100101, China
  • Online: 2020-02-15 Published: 2020-03-06

Abstract:

Named entity recognition is an important stage in natural language processing. In recent years, deep-learning-based models have achieved remarkable results on general-purpose named entity recognition. In the tourism domain, however, Chinese scenic spot entity recognition still relies mainly on feature engineering. This paper proposes a neural network model based on CNN-BiLSTM-CRF. The model uses no hand-crafted features: it extracts and represents the local features of the text through the neural network, and learns and exploits the contextual information of the text to recognize scenic spot entities. The experimental results show that the method effectively identifies Chinese scenic spot entities, reaching an F1 score of 93.9% in the experiments.

Key words: Chinese named entity recognition, deep learning, scenic spot recognition, Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), Conditional Random Field (CRF)
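
The abstract reports an F1 score but does not state the tagging scheme or the scoring protocol. As a rough illustration only, the sketch below shows how an entity-level F1 could be computed, assuming character-level BIO tags and exact-match scoring of entity spans. The function names `extract_spans` and `entity_f1` and the sample sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def extract_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Collect (start, end) entity spans from a BIO tag sequence; end is exclusive."""
    spans: Set[Tuple[int, int]] = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "I" and start is not None:
            continue  # still inside the current entity
        if start is not None:
            spans.add((start, i))  # the open entity ends just before position i
            start = None
        if tag == "B":
            start = i  # a new entity begins here
        # A stray "I" with no open entity is ignored (assumption of this sketch).
    if start is not None:
        spans.add((start, len(tags)))  # entity running to the end of the sentence
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching entity spans."""
    gold_spans = set()
    pred_spans = set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans from
        # different sentences never collide.
        gold_spans |= {(idx, s, e) for s, e in extract_spans(g)}
        pred_spans |= {(idx, s, e) for s, e in extract_spans(p)}
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical sentence "我去了颐和园和故宫" with two scenic spots:
# 颐和园 (positions 3-5) and 故宫 (positions 7-8).
gold_tags = [["O", "O", "O", "B", "I", "I", "O", "B", "I"]]
# A hypothetical prediction that finds 颐和园 but misses 故宫.
pred_tags = [["O", "O", "O", "B", "I", "I", "O", "O", "O"]]

precision, recall, f1 = entity_f1(gold_tags, pred_tags)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under exact-match scoring a partially overlapping span counts as both a false positive and a false negative, so this is a strict reading of the metric; the paper may have used a different protocol.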