
Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (12): 1-11. DOI: 10.3778/j.issn.1002-8331.2409-0181
JI Xinmeng, ZAN Hongying, CUI Tingting, ZHANG Kunli
Online: 2025-06-15
Published: 2025-06-13
Abstract: In recent years, large language models (LLMs), exemplified by ChatGPT, have attracted widespread attention and achieved outstanding performance across many fields, driving a new wave of progress in artificial intelligence. More than one hundred domestic large models have now been released in China, covering numerous industries, and their application scenarios continue to expand. To better respond to the development of LLMs in natural language processing and their impact on general tasks and domain applications, this paper reviews the development of natural language processing and of large models, describes current LLM-related technologies and the applications of large models in vertical domains such as medicine, law, and finance, and analyzes the challenges large models face in deployment, such as capability deficiencies and collaboration issues. Finally, it discusses future research directions for large models in practical applications in light of these problems.
JI Xinmeng, ZAN Hongying, CUI Tingting, ZHANG Kunli. Status and Challenges of Large Language Models Applications in Vertical Domains[J]. Computer Engineering and Applications, 2025, 61(12): 1-11.
[1] DONG Lei, WU Fuju, SHI Jianyong, PAN Longfei. Construction and application of a multimodal knowledge graph for construction safety based on large language models[J]. Computer Engineering and Applications, 2025, 61(9): 325-333.
[2] SHI Zhanglong, ZHOU Xi, WANG Zhen, MA Bo, YANG Yating. Multi-task enhanced generative method for event argument extraction[J]. Computer Engineering and Applications, 2025, 61(9): 168-176.
[3] REN Haiyu, LIU Jianping, WANG Jian, GU Xunxun, CHEN Xi, ZHANG Yue, ZHAO Changxu. Survey of intelligent question answering systems based on large language models[J]. Computer Engineering and Applications, 2025, 61(7): 1-24.
[4] WANG Jingkai, QIN Donghong, BAI Fengbo, LI Lulu, KONG Lingru, XU Chen. Survey of research on fusing speech recognition with large language models[J]. Computer Engineering and Applications, 2025, 61(6): 53-63.
[5] TAO Jiangyao, XI Xuefeng, SHENG Shengli, CUI Zhiming, ZUO Yan. Survey of structured thought prompting for enhancing the reasoning capability of large language models[J]. Computer Engineering and Applications, 2025, 61(6): 64-83.
[6] JIANG Shuangwu, ZHANG Jiawei, HUA Liansheng, YANG Jinglin. Implementation of a meteorological database question answering model based on large model retrieval-augmented generation[J]. Computer Engineering and Applications, 2025, 61(5): 113-121.
[7] YU Chengxu, ZHANG Yulai. Research on backdoor defense in deep learning based on fine-tuning[J]. Computer Engineering and Applications, 2025, 61(5): 155-164.
[8] XIAO Yu, XIAO Jing, LIN Guijin, NI Rongsen, XIAN Jiarong, YUAN Jibao. Construction and study of an interpretable logical reasoning dataset[J]. Computer Engineering and Applications, 2025, 61(4): 114-121.
[9] YUAN Zhongxu, LI Li, HE Fan, YANG Xiu, HAN Dongxuan. Traditional Chinese medicine question answering model integrating chain-of-thought and knowledge graphs[J]. Computer Engineering and Applications, 2025, 61(4): 158-166.
[10] LI Yue, HONG Hailan, LI Wenlin, YANG Tao. Application of large language models in constructing a knowledge graph of rhinitis medical cases[J]. Computer Engineering and Applications, 2025, 61(4): 167-175.
[11] LIU Yue, LI Huayi, ZHANG Shijie, ZHANG Chao, ZHAO Xiangtian. Survey of initialization techniques for visual-inertial navigation systems[J]. Computer Engineering and Applications, 2025, 61(2): 1-18.
[12] LIU Xueying, YUN Jing, LI Bo, SHI Xiaoguo, ZHANG Yuying. Survey of retrieval-augmented generation based on large language models[J]. Computer Engineering and Applications, 2025, 61(13): 1-25.
[13] JIA Zheyuan, JIN Fenglin, HE Yuan. Survey of intelligent traffic offloading techniques in space-air-ground integrated networks[J]. Computer Engineering and Applications, 2025, 61(13): 46-61.
[14] WANG Yuting, CHEN Bo, YAN Qiang, FAN Yixing, YU Zhihua, GUO Jiafeng. Retrieval-augmented generation question answering based on question-oriented prompt learning and multi-path reasoning[J]. Computer Engineering and Applications, 2025, 61(12): 120-128.
[15] MENG Xiangzhong, XIA Hongbin, LIU Yuan. Controllable story generation model with adaptive knowledge enhancement[J]. Computer Engineering and Applications, 2025, 61(12): 129-140.