Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (21): 259-263.

• Engineering and Applications •

Application of user credit rating based on weighted voting sampling algorithm

CHEN Nian1,2, TANG Zhenmin2

  1. Department of Mathematics and Computer Science, Chizhou University, Chizhou, Anhui 247000, China
  2. Computer Science and Engineering College, Nanjing University of Science and Technology, Nanjing 210094, China
  • Online: 2014-11-01 Published: 2014-10-28

Abstract: Building on the query-by-committee algorithm, this paper proposes a weighted voting method that dynamically adjusts the weights of the committee's member classifiers during sampling. When evaluating how valuable an unlabeled sample is to label, the method strengthens the query contribution of high-accuracy members, reduces the voting influence of high-error members, and lowers the number of labeling queries needed during training. Experiments on the UCI Statlog (Australian Credit Approval) dataset, in which user credit levels are identified and the results are compared with other sampling methods, show that the method attains stable generalization accuracy at a lower sampling and labeling cost.

Key words: active learning, sampling query, weighted voting, entropy, labeling threshold
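
The weighting idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the disagreement measure, the weight update rule, or the threshold value, so the details here are assumptions: disagreement is taken to be the entropy of the weight-normalised vote distribution, a member's weight is multiplied by a hypothetical factor `beta` whenever its vote disagrees with the obtained label, and a sample is sent for labeling when the entropy reaches a caller-chosen `threshold`. The function names are illustrative and are not taken from the paper.

```python
import math
from collections import defaultdict


def weighted_vote_entropy(votes, weights):
    """Entropy of the committee's vote distribution, with each vote
    counted in proportion to its member's weight (assumed measure)."""
    total = sum(weights)
    mass = defaultdict(float)
    for label, weight in zip(votes, weights):
        mass[label] += weight
    entropy = 0.0
    for m in mass.values():
        p = m / total
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


def should_query(votes, weights, threshold):
    """Request a label only when weighted disagreement reaches the
    labeling threshold (the threshold value is chosen by the caller)."""
    return weighted_vote_entropy(votes, weights) >= threshold


def update_weights(votes, weights, true_label, beta=0.5):
    """Shrink the weight of every member whose vote was wrong.
    The multiplicative factor beta is an assumption, not the paper's rule."""
    return [w if v == true_label else w * beta
            for v, w in zip(votes, weights)]


if __name__ == "__main__":
    votes = [1, 1, 0]            # predictions of three committee members
    weights = [1.0, 1.0, 1.0]    # all members start with equal weight
    if should_query(votes, weights, threshold=0.6):
        # Suppose the oracle returns label 1 for this sample.
        weights = update_weights(votes, weights, true_label=1)
    print(weights)
    print(should_query(votes, weights, threshold=0.6))
```

In this sketch, the member that voted incorrectly loses half of its weight, so an identical split of votes on a later sample yields a lower weighted entropy and may no longer cross the threshold. That is one way a committee could reduce the number of labeling requests as unreliable members are discounted.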