Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (15): 27-31.

• Theoretical Research, R&D and Design •

Code function recognition approach based on LDA and static analysis

JIN Jing (金靖)1,2, LI Meng (李萌)1,2, HUA Zhebang (华哲邦)1,2, SONG Huaida (宋怀达)1,2, ZHAO Junfeng (赵俊峰)1,2, XIE Bing (谢冰)1,2

  1. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
  2. Key Lab of High Confidence Software Technologies, Ministry of Education, Peking University, Beijing 100871, China
  • Online: 2013-08-01  Published: 2013-07-31

Abstract: In recent years, as code reuse technology has matured and open source projects on the Internet have multiplied, the way software developers work has gradually changed: they increasingly rely on functionality supplied by open source projects while programming. However, because the documentation of open source projects is often incomplete and their code structure is complex, developers usually learn only a few of a project's functions rather than gaining a comprehensive understanding, so reuse efficiency is low. To address the situation in which open source projects are rich in code but poor in documentation, this paper proposes a code function recognition approach based on LDA (Latent Dirichlet Allocation) and static analysis. The approach extends traditional LDA to help developers understand a project's functions more comprehensively, and thus better supports code reuse.

Key words: software reuse, source code, Latent Dirichlet Allocation (LDA), static analysis, function recognition