Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (4): 114-121. DOI: 10.3778/j.issn.1002-8331.2309-0458

• Pattern Recognition and Artificial Intelligence •

Construction and Study of Explainable Logical Reasoning Dataset

XIAO Yu (肖宇), XIAO Jing (肖菁), LIN Guijin (林桂锦), NI Rongsen (倪荣森), XIAN Jiarong (冼嘉荣), YUAN Jibao (袁基保)

  1. School of Artificial Intelligence, South China Normal University, Foshan, Guangdong 528000, China
  2. School of Computer, South China Normal University, Guangzhou 510631, China
  • Online: 2025-02-15 Published: 2025-02-14

Abstract: Logical reasoning ability is crucial for both machines and humans to understand natural language. An explanation of a logical reasoning problem elaborates and describes the reasoning process, yet existing datasets for testing machine logical reasoning lack such explanations. To address this gap, this paper constructs a Chinese and English dataset for explainable logical reasoning (Ex-LoR). The dataset contains 3,411 logical reasoning problems with explanations and groups the problems into six classes according to their reasoning methods. Two tasks are designed: a logical reasoning question answering task and an explanation generation task. Experiments and analysis with several language models on this dataset show that existing language models still cannot answer logical reasoning questions well or generate reasonable explanations, so equipping machines with logical reasoning ability remains challenging. The dataset and experimental results presented in this paper can serve as a benchmark for subsequent research.

Keywords: logical reasoning, Chinese and English dataset, explainable, natural language processing