Computer Engineering and Applications, 2024, Vol. 60, Issue (3): 213-219. DOI: 10.3778/j.issn.1002-8331.2303-0045

• Pattern Recognition and Artificial Intelligence •

Cross-Modal Named Entity Recognition in Image Text for Procurement Documents

YANG Sai, LIU Xin, YU Shaowen   

  1. Xinfangsheng Digital Intelligence Technology Co., Ltd., Beijing 102600, China
  2. Business School, University of Edinburgh, Edinburgh EH8 9JS, UK
  3. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  • Online: 2024-02-01 Published: 2024-02-01

Abstract: Digital and intelligent procurement in a smart supply chain can improve procurement efficiency and save substantial labor costs. Procurement documents contain many scanned files such as certificates and qualifications. To address the uneven text layout and unclear scanned images in these files, this paper designs O2V2BLC (OCR-Vector-Bi-LSTM-CRF), an end-to-end cross-modal named entity recognition model based on deep learning that recognizes named entities in image text. For the characters recognized by optical character recognition (OCR), the model defines the boundaries of continuous text, maps each character within a boundary to a vector, uses a bidirectional long short-term memory (Bi-LSTM) network to capture the contextual semantics of the character sequence within the boundary, and computes a state score matrix for the characters. A conditional random field (CRF) then constrains the character label sequence with labeling rules to obtain the globally optimal label sequence. The prediction error for named entities is computed on the training set, and the parameters of the O2V2BLC model are optimized dynamically. Applied to image-text data from procurement documents, such as qualification-type images, the method effectively identifies named entities including bidding units, expert names, and specialty names. Compared with the conditional random field, the hidden Markov model, and the BERT-Bi-LSTM-CRF model, it significantly improves entity recognition accuracy and provides support for digital and intelligent procurement in the smart supply chain.

Key words: smart supply chain, named entity recognition, optical character recognition, bidirectional long short-term memory (Bi-LSTM), conditional random fields
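
The final step described in the abstract, in which a conditional random field constrains the label sequence and selects the globally optimal one, can be illustrated with a small Viterbi decoder. This is only a sketch under stated assumptions and not the authors' implementation: the BIO tag set, the toy emission scores (standing in for the Bi-LSTM state score matrix), and the hard-coded transition rule are all hypothetical. A trained CRF would learn its transition scores from data.

```python
# Illustrative sketch only: the tag set, scores, and constraint rule are
# assumptions for demonstration, not values or code from the paper.

NEG_INF = float("-inf")
TAGS = ["O", "B-ORG", "I-ORG"]


def transition_score(prev_tag, next_tag):
    """Hypothetical transition scores: forbid I-ORG directly after O."""
    if prev_tag == "O" and next_tag == "I-ORG":
        return NEG_INF
    return 0.0


def viterbi_decode(emissions, tags=TAGS):
    """Return the highest-scoring valid tag sequence and its total score.

    emissions[t][j] is the score of tag j for character t, the role the
    Bi-LSTM output plays in a Bi-LSTM-CRF tagger.
    """
    n_tags = len(tags)
    # An I- tag may not start a sequence.
    best = [
        NEG_INF if tags[j].startswith("I-") else emissions[0][j]
        for j in range(n_tags)
    ]
    backpointers = []
    for row in emissions[1:]:
        new_best, pointers = [], []
        for j in range(n_tags):
            # Score of reaching tag j from every possible previous tag.
            candidates = [
                best[i] + transition_score(tags[i], tags[j])
                for i in range(n_tags)
            ]
            i_star = max(range(n_tags), key=candidates.__getitem__)
            new_best.append(candidates[i_star] + row[j])
            pointers.append(i_star)
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    last = max(range(n_tags), key=best.__getitem__)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[j] for j in path], best[last]


# Toy emission scores for four characters; columns follow TAGS order.
emissions = [
    [2.0, 1.0, 0.0],
    [1.0, 0.5, 2.0],
    [0.0, 0.2, 1.5],
    [1.5, 0.0, 0.3],
]
decoded_tags, decoded_score = viterbi_decode(emissions)

# Per-character argmax, ignoring the transition constraint.
greedy_tags = [
    TAGS[max(range(len(TAGS)), key=row.__getitem__)] for row in emissions
]

print(decoded_tags)  # ['B-ORG', 'I-ORG', 'I-ORG', 'O']
print(greedy_tags)   # ['O', 'I-ORG', 'I-ORG', 'O']
```

In this toy case, picking the best tag per character independently yields an entity that starts with I-ORG, which the BIO scheme does not allow. Decoding over the whole sequence under the transition rule returns a well-formed entity span instead, which is the motivation for placing a CRF layer on top of the per-character scores.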
