Computer Engineering and Applications, 2023, Vol. 59, Issue (20): 35-50. DOI: 10.3778/j.issn.1002-8331.2302-0083

• Hot Topics and Reviews •

Survey of 3D Scene Recognition and Representation Methods of Multimodal Knowledge Graphs

LI Jianxin, SI Guannan, TIAN Pengxin, AN Zhaoliang, ZHOU Fengyu

  1. School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China
  2. College of Control Science and Engineering, Shandong University, Jinan 250000, China
  • Online: 2023-10-15 Published: 2023-10-15

Abstract: This paper reviews the application of multimodal knowledge graph technology to scene recognition. The technology integrates 3D domain knowledge at different levels into deep neural networks to achieve scene cognition and knowledge representation. The paper describes the technology systematically at three levels: the storage, acquisition, and induction of knowledge. Its contributions are a comprehensive review of existing techniques that use external feature databases to rapidly construct 3D scene graphs, an in-depth discussion of deep learning methods for processing 3D point clouds and video, and an analysis of future research directions in this field. The survey is intended as a reference for further research in artificial intelligence and related fields. It also aims to strengthen the integration of multimodal knowledge graphs with other AI technologies, such as natural language processing and computer vision, in support of more intelligent, automated, and human-centered applications.

Keywords: scene graph, knowledge graph, neural network, multimodality