Literature Survey of Software Project Duration Prediction

doi:10.3778/j.issn.1002-8331.2407-0547

Abstract

Abstract: In the early stage of a software development project, little information is available and requirements are often poorly defined. An accurate prediction of project duration at this stage helps allocate resources more effectively, reduces R&D costs, and raises the likelihood of project success. Because of its theoretical value and practical relevance, software project duration prediction has long received attention from both academia and industry. This paper systematically reviews research on software project duration prediction from China and abroad. It introduces the datasets and indicator features commonly used for duration prediction; surveys the various prediction methods and the theory of performance evaluation; summarizes applications and typical cases; and concludes with directions for future research.

Key words: software project, duration prediction, prediction methods

ZHU Qingkang, LI Hongbo. Literature Survey of Software Project Duration Prediction[J]. Computer Engineering and Applications, 2025, 61(8): 49-61.
