Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (6): 64-83. DOI: 10.3778/j.issn.1002-8331.2405-0069
• Research Hotspots and Reviews •
TAO Jiangyao, XI Xuefeng, SHENG Shengli, CUI Zhiming, ZUO Yan
Online: 2025-03-15
Published: 2025-03-14
TAO Jiangyao, XI Xuefeng, SHENG Shengli, CUI Zhiming, ZUO Yan. Review on Enhancing Reasoning Abilities of Large Language Model Through Structured Thinking Prompts[J]. Computer Engineering and Applications, 2025, 61(6): 64-83.
[1] WANG Jingkai, QIN Donghong, BAI Fengbo, LI Lulu, KONG Lingru, XU Chen. Review of Research on Fusion Technology of Speech Recognition and Large Language Models[J]. Computer Engineering and Applications, 2025, 61(6): 53-63.
[2] JIANG Shuangwu, ZHANG Jiawei, HUA Liansheng, YANG Jinglin. Implementation of Meteorological Database Question-Answering Based on Large-Scale Model Retrieval-Augmentation Generation[J]. Computer Engineering and Applications, 2025, 61(5): 113-121.
[3] YUAN Zhongxu, LI Li, HE Fan, YANG Xiu, HAN Dongxuan. Traditional Chinese Medicine Question Answering Model Based on Chain-of-Thought and Knowledge Graph[J]. Computer Engineering and Applications, 2025, 61(4): 158-166.
[4] LI Yue, HONG Hailan, LI Wenlin, YANG Tao. Study on Application of Large Language Model in Constructing Knowledge Graph of Medical Cases of Rhinitis[J]. Computer Engineering and Applications, 2025, 61(4): 167-175.
[5] JI Yihao, REN Yizhi, YUAN Lifeng, LIU Rongke, PAN Gaoning. Event Type Induction Combined with Contrastive Learning and Iterative Optimization[J]. Computer Engineering and Applications, 2025, 61(3): 196-211.
[6] WANG Xinlei, WANG Shuo, ZHAI Jiazheng, XIAO Ruilin, LIAO Chenxu. Object Detection Algorithm of Aerial Image in Complex Weather Based on Multi-Task Joint Learning[J]. Computer Engineering and Applications, 2025, 61(2): 97-111.
[7] HUANG Shiyang, XI Xuefeng, CUI Zhiming. Research and Exploration on Chinese Natural Language Processing in Era of Large Language Models[J]. Computer Engineering and Applications, 2025, 61(1): 80-97.
[8] CAI Guoyong, LI Anqing. Prompt-Learning Inspired Approach to Unsupervised Sentiment Style Transfer[J]. Computer Engineering and Applications, 2024, 60(5): 146-155.
[9] CUI Jinman, LI Dongmei, TIAN Xuan, MENG Xianghao, YANG Yu, CUI Xiaohui. Survey on Prompt Learning[J]. Computer Engineering and Applications, 2024, 60(23): 1-27.
[10] YAO Yi, CHEN Zhaoyang, DU Xiaoming, YAO Tianlei, LI Qingshang, SUN Mingwei. Survey of Multimodal Knowledge Graph Construction Technology and Its Application in Military Field[J]. Computer Engineering and Applications, 2024, 60(22): 18-37.
[11] ZHANG Qintong, WANG Yuchao, WANG Hexi, WANG Junxin, CHEN Hai. Comprehensive Review of Large Language Model Fine-Tuning[J]. Computer Engineering and Applications, 2024, 60(17): 17-33.
[12] SU Youli, HU Xuanyu, MA Shijie, ZHANG Yuning, Abudukelimu Abulizi, Halidanmu Abudukelimu. Review of Research on Artificial Intelligence in Traditional Chinese Medicine Diagnosis and Treatment[J]. Computer Engineering and Applications, 2024, 60(16): 1-18.
[13] GAO Shuai, XI Xuefeng, ZHENG Qian, CUI Zhiming, SHENG Shengli. Review of Research on Natural Language Interfaces for Data Visualization[J]. Computer Engineering and Applications, 2024, 60(15): 24-41.
[14] YU Fengrui. Survey on Automated Recognition and Extraction of TTPs[J]. Computer Engineering and Applications, 2024, 60(13): 1-22.
[15] GU Xunxun, LIU Jianping, XING Jialu, REN Haiyu. Text Classification: Comprehensive Review of Prompt Learning Methods[J]. Computer Engineering and Applications, 2024, 60(11): 50-61.