End-to-End Triple Extraction Incorporated Formal Concept of Relation

doi:10.3778/j.issn.1002-8331.2201-0418

Abstract

Triple extraction is foundational to knowledge learning and knowledge graph construction. Current models for this task often suffer from weak semantic association between entity recognition and relation extraction, entity nesting, relation overlap, and little attention to existing concept knowledge. To address these problems, this paper combines formal concepts with a neural network model and proposes an end-to-end triple extraction method based on formal concepts of relations. First, formal concept labels of relations are introduced to unify the semantic representation of entities and relations, which converts the entity recognition problem into a concept label learning problem. Second, the entities are fed into a relational formal concept attention model, whose attention mechanism tries to capture the connected intent features of the subject and object concepts of a relation; that is, training yields, for each relation label, the combined features of its subjects, its objects, and their context-dependent predicates. Third, multiple relation classifiers output the multiple relation labels of each subject-object pair, realizing multi-relation extraction based on concept connectivity. In addition, the model can incorporate the extent and intent of existing formal concepts, which reduces its dependence on corpus labels and eases the labeling difficulties caused by entity nesting. Experiments on two datasets, evaluated with three metrics, show that the proposed model is practical and effective for knowledge extraction and alleviates the entity nesting and relation overlap problems.

Key words: triple extraction, formal concept, attention, relation overlap
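
For readers unfamiliar with formal concept analysis, the extent and intent of a formal concept (sometimes rendered as "extension" and "connotation") can be illustrated with a minimal Python sketch. The toy context, the entity and attribute names, and the function names below are hypothetical and are not taken from the paper; the sketch shows only the standard derivation operators of formal concept analysis, not the proposed extraction model.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy formal context: each object (entity) maps to the
# attributes it has, here an entity type and a role in a relation.
CONTEXT: Dict[str, Set[str]] = {
    "aspirin": {"drug", "subject_of:treats"},
    "ibuprofen": {"drug", "subject_of:treats"},
    "headache": {"symptom", "object_of:treats"},
}


def intent(objects: Set[str]) -> Set[str]:
    """Return the attributes shared by every object in `objects`."""
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*CONTEXT.values())
    return set.intersection(*(CONTEXT[o] for o in objects))


def extent(attributes: Set[str]) -> Set[str]:
    """Return the objects that have every attribute in `attributes`."""
    return {o for o, attrs in CONTEXT.items() if attributes <= attrs}


def is_formal_concept(objects: Set[str], attributes: Set[str]) -> bool:
    """A pair (A, B) is a formal concept when A' = B and B' = A."""
    return intent(objects) == attributes and extent(attributes) == objects


def close(objects: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return the smallest formal concept whose extent contains `objects`."""
    shared = intent(objects)
    return extent(shared), shared


if __name__ == "__main__":
    concept_extent, concept_intent = close({"aspirin"})
    print(sorted(concept_extent), sorted(concept_intent))
```

In this toy context, closing the single entity "aspirin" yields a concept whose extent also contains "ibuprofen", because both entities share exactly the same attributes. Grouping entities under a shared relation-level concept in this way is the general idea that concept labels build on; how the paper defines its labels is not specified in the abstract.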

CHENG Chunlei, ZOU Jing, YE Qing, ZHANG Suhua, LAN Yong, YANG Rui. End-to-End Triple Extraction Incorporated Formal Concept of Relation[J]. Computer Engineering and Applications, 2023, 59(9): 182-189.

程春雷, 邹静, 叶青, 张素华, 蓝勇, 杨瑞. 融入关系形式化概念的端到端三元组抽取[J]. 计算机工程与应用, 2023, 59(9): 182-189.
