Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (5): 32-42. DOI: 10.3778/j.issn.1002-8331.2405-0396

• Hot Topics and Reviews •

Development and Application of Light Gradient Boosting Machine

WEI Jiamei, YUAN Shujuan, KONG Shanshan, YANG Aimin, ZHAO Chenying

  1. College of Science, North China University of Science and Technology, Tangshan, Hebei 063000, China
  2. Hebei Engineering Research Center for the Intelligentization of Iron Ore Optimization and Ironmaking Raw Materials Preparation Processes, North China University of Science and Technology, Tangshan, Hebei 063000, China
  3. Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan, Hebei 063000, China
  4. The Key Laboratory of Engineering Computing in Tangshan City, North China University of Science and Technology, Tangshan, Hebei 063000, China
  5. Tangshan Intelligent Industry and Image Processing Technology Innovation Center, North China University of Science and Technology, Tangshan, Hebei 063000, China
  • Online: 2025-03-01 Published: 2025-03-01

Abstract: Light gradient boosting machine (LightGBM) is one of the more powerful algorithms in machine learning. It uses an efficient tree learning algorithm to train models faster, and its histogram-based binning, gradient-based one-side sampling, and leaf-wise tree growth reduce memory usage and computational cost. LightGBM is widely used in medicine, natural language processing, finance, industrial manufacturing, and other fields. However, it still faces challenges in high-dimensional data processing, categorical feature handling, and model interpretability. Current approaches to these problems focus mainly on feature engineering, visualization, and hybrid models, and have achieved good results. This paper first introduces the principles and variants of the algorithms in the decision tree family. It then reviews the principles, advantages, and disadvantages of LightGBM, summarizes the challenges the algorithm faces, and points out future research hot spots and difficulties. Finally, it summarizes the development of LightGBM and discusses its prospects.

Keywords: light gradient boosting machine (LightGBM), decision tree, ensemble learning, machine learning