Survey of 3D Scene Recognition and Representation Methods of Multimodal Knowledge

doi:10.3778/j.issn.1002-8331.2302-0083

Abstract

This paper reviews the application of multimodal knowledge graph technology to scene recognition. The technique incorporates 3D domain knowledge at different levels into deep neural networks to achieve scene awareness and knowledge representation. The paper describes the technology systematically at three levels: storage, acquisition, and induction of knowledge. Its contributions are a comprehensive review of existing techniques for rapidly constructing 3D scene graphs with external feature databases, an in-depth discussion of deep learning methods for processing 3D point clouds and videos, and an analysis of future research directions in this field. The survey provides a reference for further research in related areas and supports closer integration of multimodal knowledge graphs with other AI technologies (such as natural language processing and computer vision), with the aim of more intelligent, automated, and human-centered applications.

Key words: scene graph, knowledge graph, neural network, multimodality

LI Jianxin, SI Guannan, TIAN Pengxin, AN Zhaoliang, ZHOU Fengyu. Survey of 3D Scene Recognition and Representation Methods of Multimodal Knowledge[J]. Computer Engineering and Applications, 2023, 59(20): 35-50.
