Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (29): 69-73.

• Research, Design, and Testing •

Software semantic annotation based on entropy reduction and multi-kernel SVM

MA Zhe, BEN Kerong, LIU Yu

  1. Department of Computer Engineering, Naval University of Engineering, Wuhan 430033, China
  • Online: 2011-10-11 Published: 2011-10-11

Abstract: A method for annotating program semantics is proposed, based on dimension reduction by conditional information entropy and a multi-kernel support vector machine (SVM). Compared with traditional ontology-based semantic annotation, the method has the following characteristics: it uses machine learning to annotate software semantics automatically; it balances positive and negative samples by resampling; it uses conditional information entropy to reduce the dimensionality of the module sample features of object-oriented programs, which lowers the computational complexity and overhead of the problem, and a method for converting this reduction to an algebraic reduction is given; and its kernel function is a linear combination of several base kernel functions, which balances learning ability against generalization performance. Annotation examples show that the method, referred to as CIEMK-SVM, maintains high annotation accuracy and is practical and broadly applicable.

Key words: semantic annotation, conditional information entropy, kernel function, support vector machine
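
As an illustration of the entropy-based reduction mentioned in the abstract, the following is a minimal sketch, not the authors' algorithm. It assumes the module features have already been discretized, and it uses a greedy backward elimination that drops a feature whenever the conditional entropy H(D | remaining features) of the label D does not increase. The function names `conditional_entropy` and `entropy_reduct` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, labels, attrs):
    """Return H(D | attrs) in bits for discrete feature rows and labels D."""
    n = len(rows)
    # Group the labels of all rows that agree on the selected attributes.
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        blocks[tuple(row[a] for a in attrs)].append(label)
    h = 0.0
    for block in blocks.values():
        weight = len(block) / n
        for count in Counter(block).values():
            p = count / len(block)
            h -= weight * p * math.log2(p)
    return h


def entropy_reduct(rows, labels, tol=1e-12):
    """Greedily drop features whose removal leaves H(D | features) unchanged.

    This is one simple strategy (an assumption of this sketch); the result
    depends on the order in which features are tried.
    """
    kept = list(range(len(rows[0])))
    target = conditional_entropy(rows, labels, kept)
    for a in list(kept):
        trial = [b for b in kept if b != a]
        if conditional_entropy(rows, labels, trial) <= target + tol:
            kept = trial
    return kept


# Toy data: feature 1 duplicates feature 0, feature 2 is unrelated to the label.
rows = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0)]
labels = [0, 0, 1, 1]
reduct = entropy_reduct(rows, labels)
```

On this toy data a single feature is enough to determine the label, so the reduct has one element; real software metrics would first need a discretization step, which is outside the scope of this sketch.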
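
The abstract states that the kernel function is a linear combination of several base kernels. A minimal sketch of that idea follows, assuming (these choices are not from the paper) two base kernels, linear and Gaussian RBF, with fixed non-negative weights. The names `combined_kernel` and `gram_matrix`, the weights, and the `gamma` value are hypothetical.

```python
import math


def linear_kernel(x, z):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def combined_kernel(x, z, weights=(0.3, 0.7), gamma=0.5):
    """Weighted sum of base kernels.

    With non-negative weights the sum of positive semidefinite kernels is
    again a valid kernel. The weights here are illustrative assumptions.
    """
    w_lin, w_rbf = weights
    return w_lin * linear_kernel(x, z) + w_rbf * rbf_kernel(x, z, gamma)


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of samples."""
    return [[kernel(x, z) for z in samples] for x in samples]


samples = [(1.0, 0.0), (0.0, 1.0)]
K = gram_matrix(samples, combined_kernel)
```

A Gram matrix of this kind can be handed to any SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. A global kernel such as the linear one tends to generalize, while a local kernel such as the RBF fits fine structure, which is the usual motivation for mixing them.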