[1] YU W, JIANG Z, DONG Y, et al. ReClor: a reading comprehension dataset requiring logical reasoning[J]. arXiv:2002.04326, 2020.
[2] LIU J, CUI L, LIU H, et al. LogiQA: a challenge dataset for machine reading comprehension with logical reasoning[C]//Proceedings of the 29th International Joint Conference on Artificial Intelligence, 2020: 3622-3628.
[3] LIU H, LIU J, CUI L, et al. LogiQA 2.0: an improved dataset for logical reasoning in natural language understanding[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 2947-2962.
[4] HUANG Y, FANG M, CAO Y, et al. DAGN: discourse-aware graph network for logical reasoning[C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021: 5848-5855.
[5] LI X, CHENG G, CHEN Z, et al. AdaLoGN: adaptive logic graph network for reasoning-based machine reading comprehension[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022: 7147-7161.
[6] XU F, LIU J, LIN Q, et al. Logiformer: a two-branch graph transformer network for interpretable logical reasoning[C]//Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2022: 1055-1065.
[7] JIAO F, GUO Y, SONG X, et al. MERIt: meta-path guided contrastive learning for logical reasoning[J]. arXiv:2203.00357, 2022.
[8] WANG S, ZHONG W, TANG D, et al. Logic-driven context extension and data augmentation for logical reasoning of text[C]//Findings of the Association for Computational Linguistics: ACL 2022, 2022: 1619-1629.
[9] PAN W, GAO Y P. On MBA and civil service logic examinations and the teaching of critical thinking courses[J]. Heihe Journal, 2009(7): 26-27.
[10] WIEGREFFE S, MARASOVIĆ A. Teach me to explain: a review of datasets for explainable natural language processing[J]. arXiv:2102.12060, 2021.
[11] KOCIJAN V, LUKASIEWICZ T, DAVIS E, et al. A review of Winograd schema challenge datasets and approaches[J]. arXiv:2004.13831, 2020.
[12] TALMOR A, HERZIG J, LOURIE N, et al. CommonsenseQA: a question answering challenge targeting commonsense knowledge[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019: 4149-4158.
[13] GORDON A, KOZAREVA Z, ROEMMELE M. SemEval-2012 task 7: choice of plausible alternatives: an evaluation of commonsense causal reasoning[C]//Proceedings of the 6th International Workshop on Semantic Evaluation, 2012: 394-398.
[14] BOWMAN S R, ANGELI G, POTTS C, et al. A large annotated corpus for learning natural language inference[J]. arXiv:1508.05326, 2015.
[15] ZHANG H, ZHAO X, SONG Y. WinoWhy: a deep diagnosis of essential commonsense knowledge for answering Winograd schema challenge[J]. arXiv:2005.05763, 2020.
[16] DU L, DING X, XIONG K, et al. e-CARE: a new dataset for exploring explainable causal reasoning[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022: 432-446.
[17] CHEN J J, XU R, FU Z Q, et al. E-KAR: a benchmark for rationalizing natural language analogical reasoning[C]//Findings of the Association for Computational Linguistics: ACL 2022, 2022: 3941-3955.
[18] PRASAD R, DINESH N, LEE A, et al. The Penn Discourse TreeBank 2.0[C]//Proceedings of the 6th International Conference on Language Resources and Evaluation, 2008.
[19] ZELLERS R, HOLTZMAN A, BISK Y, et al. HellaSwag: can a machine really finish your sentence?[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019: 4791-4800.
[20] GURURANGAN S, SWAYAMDIPTA S, LEVY O, et al. Annotation artifacts in natural language inference data[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, 2018: 107-112.
[21] POLIAK A, NARADOWSKY J, HALDAR A, et al. Hypothesis only baselines in natural language inference[C]//Proceedings of the 7th Joint Conference on Lexical and Computational Semantics, 2018: 180-191.
[22] LIN C Y. ROUGE: a package for automatic evaluation of summaries[C]//Proceedings of the Workshop on Text Summarization Branches Out, 2004: 74-81.
[23] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002: 311-318.
[24] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, 2019: 4171-4186.
[25] YANG Z, DAI Z, YANG Y, et al. XLNet: generalized autoregressive pretraining for language understanding[C]//Advances in Neural Information Processing Systems 32, 2019.
[26] LIU Y, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[J]. arXiv:1907.11692, 2019.
[27] LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 7871-7880.
[28] RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer[J]. The Journal of Machine Learning Research, 2020, 21(1): 5485-5551.