Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (12): 357-365. DOI: 10.3778/j.issn.1002-8331.2303-0467

• Engineering and Applications •

基于语义感知的工业制造领域知识抽取方法

黄子麒,胡建鹏   

  1. 上海工程技术大学 电子电气工程学院,上海 201620
  • 出版日期:2024-06-15 发布日期:2024-06-14

Semantic-Aware Approach to Knowledge Extraction from Industrial Manufacturing Domains

HUANG Ziqi, HU Jianpeng   

  1. School of Electric and Electronic Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2024-06-15 Published: 2024-06-14

摘要: 工业制造领域通用知识抽取方法研究对于实现工业知识库自动化构建意义重大。针对工业本体定义需要大量人工成本和专家经验作指导的问题,基于义原分析设计了一种半自动本体构建方法,最后以汽车生产制造故障维修数据为例,完成了本体、本体类、层级和关系定义。为解决工业领域关系抽取存在的关系嵌套和级联模型误差传播问题,设计了一种基于语义感知的关系抽取模型:在该模型的潜在关系挖掘、主语抽取、宾语抽取三个环节中,基于阅读理解方法拼接不同的引导问句,得到适用于不同环节的句子编码;为利用主语先验知识,在宾语抽取模块融入注意力机制,提高了该模块编码的表达能力;三环节联合优化训练提升抽取效果。在汽车生产制造故障维修数据集、汽车工业故障模式抽取评测数据集、装备制造数据集中进行实验,提出的模型比其他关系抽取基线模型取得了更好的效果。

关键词: 本体构建, 义原分析, 关系抽取, 注意力机制, 工业制造领域

Abstract: Research on general-purpose knowledge extraction methods for the industrial manufacturing domain is of great significance for the automated construction of industrial knowledge bases. To address the problem that defining an industrial ontology requires substantial labour cost and guidance from expert experience, a semi-automatic ontology construction method is designed based on sememe analysis; taking automobile manufacturing fault repair data as an example, the ontology, ontology classes, hierarchy and relations are defined. To address relation nesting and the error propagation of cascaded models in industrial relation extraction, a semantic-aware relation extraction model is designed. In the three stages of the model (potential relation mining, subject extraction and object extraction), different guiding questions are concatenated, following the reading comprehension approach, to obtain sentence encodings suited to each stage. To exploit prior knowledge of the subject, an attention mechanism is incorporated into the object extraction module, improving the expressiveness of that module's encoding. The three stages are jointly optimised during training to improve extraction performance. Experiments are conducted on an automobile manufacturing fault repair dataset, an automotive industry fault mode extraction evaluation dataset and an equipment manufacturing dataset, and the proposed model achieves better results than the other relation extraction baseline models.

Key words: ontology construction, sememe analysis, relation extraction, attention mechanism, industrial manufacturing domain
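
As a rough illustration of the reading-comprehension-style input described in the abstract, the sketch below builds one question-plus-sentence input for each of the three stages. The question templates, the `[CLS]`/`[SEP]` layout, the function names, and the example sentence are all assumptions made for illustration; the abstract does not give the actual question wording or the encoder used.

```python
# Hypothetical guiding-question templates; the abstract does not state the real wording.
RELATION_QUESTION = "Which relations are expressed in the sentence?"
SUBJECT_QUESTION = "Which entity is the subject of the relation '{relation}'?"
OBJECT_QUESTION = (
    "Which entity is the object of the relation '{relation}' "
    "for the subject '{subject}'?"
)


def build_input(question: str, sentence: str) -> str:
    """Concatenate a guiding question with the sentence.

    The [CLS]/[SEP] layout is an assumed BERT-style convention,
    not something the abstract specifies.
    """
    return f"[CLS] {question} [SEP] {sentence} [SEP]"


def stage_inputs(sentence: str, relation: str, subject: str) -> dict[str, str]:
    """Return one encoder input string per stage of the three-stage pipeline."""
    return {
        # Stage 1: potential relation mining needs only the sentence.
        "relation": build_input(RELATION_QUESTION, sentence),
        # Stage 2: subject extraction is conditioned on a candidate relation.
        "subject": build_input(
            SUBJECT_QUESTION.format(relation=relation), sentence
        ),
        # Stage 3: object extraction is conditioned on relation and subject.
        "object": build_input(
            OBJECT_QUESTION.format(relation=relation, subject=subject), sentence
        ),
    }


if __name__ == "__main__":
    # Made-up example sentence, relation label, and subject span.
    example = stage_inputs(
        sentence="The engine overheats because the coolant pump failed.",
        relation="caused_by",
        subject="engine overheats",
    )
    for stage, text in example.items():
        print(f"{stage}: {text}")
```

In this sketch the subject enters the object stage only through the question text. The model in the abstract additionally applies an attention mechanism over the subject in the object extraction module and trains the three stages jointly; neither is modelled here.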