Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (9): 325-333. DOI: 10.3778/j.issn.1002-8331.2408-0036

• Engineering and Applications •

Construction and Application of Multimodal Knowledge Graph in Construction Safety Field Based on Large Language Model

DONG Lei (董磊), WU Fuju (吴福居), SHI Jianyong (史健勇), PAN Longfei (潘龙飞)

  1. School of Ocean and Civil Engineering, Shanghai Jiao Tong University, Shanghai 201100, China
  2. China First Highway Xiamen Engineering Co., Ltd., Xiamen, Fujian 361021, China
  • Published: 2025-05-01 Online: 2025-04-30

Abstract: Existing construction safety management methods struggle to integrate multimodal information from text and images, offer limited domain knowledge representation and reasoning for construction site safety accidents, and require extensive domain knowledge and professional background to process and apply the data. To address these problems, this paper proposes a method for constructing a multimodal knowledge graph based on a multimodal large language model. A construction safety knowledge ontology is first built from the basic theory and practical experience of construction safety management. On this basis, the multimodal knowledge graph is constructed with a multimodal large model in three steps: data collection and preprocessing, ontology construction at the concept level, and construction at the instance level. The resulting knowledge graph integrates the accident safety knowledge contained in text with on-site image information, which improves the comprehensiveness and practicality of the knowledge. The extraction results are evaluated with three metrics, precision, recall, and F1 score, and all three scores are high, which supports the soundness and accuracy of image extraction by the large model. In practical applications, the method helps safety managers detect safety accidents on construction sites in a timely manner and provides important support for management decision-making and intelligent reasoning.

Keywords: multimodal knowledge graph, large language model, safety management, knowledge extraction, ontology construction
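
The three evaluation indexes named in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that extraction results are compared with a manually labeled reference as sets of (head, relation, tail) triples; the abstract does not state the unit of comparison, and the function name and sample triples below are hypothetical and are not data from the paper.

```python
from typing import Iterable, Tuple

# Hypothetical representation: one extracted fact = (head, relation, tail).
Triple = Tuple[str, str, str]


def precision_recall_f1(
    predicted: Iterable[Triple], gold: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Compare extracted triples with a labeled reference by exact match."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example: four reference triples, three extracted, two of them correct.
gold = [
    ("worker", "lacks", "safety helmet"),
    ("worker", "located_at", "scaffold"),
    ("scaffold", "lacks", "guardrail"),
    ("missing guardrail", "may_cause", "fall from height"),
]
predicted = [
    ("worker", "lacks", "safety helmet"),
    ("scaffold", "lacks", "guardrail"),
    ("worker", "operates", "crane"),
]

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Exact-match comparison is the strictest choice; a real evaluation might instead accept synonymous entity names or score entities and relations separately, which would change the numbers.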