Ensemble k-Nearest-Neighbor Speaker Recognition Algorithm Based on Stratified Sampling

Computer Engineering and Applications, 2007, Vol. 43, Issue (35): 226-229.

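QIAN Bo, TANG Zhen-min, LI Yan-ping, XU Li-min

Laboratory of Artificial Intelligence and Pattern Recognition, Nanjing University of Science and Technology, Nanjing 210094

Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-12-11 Published: 2007-12-11
Corresponding author: QIAN Bo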

New method of optimizing k nearest neighbor ensemble for text-independent speaker recognition

QIAN Bo, TANG Zhen-min, LI Yan-ping, XU Li-min

Nanjing University of Science & Technology, Nanjing 210094, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-12-11 Published: 2007-12-11
Contact: QIAN Bo

Abstract

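Abstract (translated from the Chinese): The k-nearest-neighbor (kNN) learner maps a complex global nonlinear relationship onto a combination of many local linear relationships. It is easy to interpret and extend and is robust to noise, so it has been widely and successfully applied to speaker recognition. Ensemble learning has attracted researchers in many fields because of its strong generalization ability and ease of use, but studies show that ensemble algorithms that create diversity among training sets by resampling do not effectively improve the generalization ability of kNN learner systems. This paper proposes a new sampling algorithm, BagWithProb, for generating the training sets. Experiments show that it effectively increases the diversity among training sets and improves the performance of the ensemble system. In addition, an algorithm based on annular-region stratified sampling is proposed to speed up kNN recognition in the recognition stage.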

Keywords (translated from the Chinese): nearest neighbor recognizer, ensemble learning, speaker recognition, stratified sampling
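The resampling-based ensemble that the abstract contrasts against can be sketched as plain bagging over a kNN classifier. This is a minimal illustration, not the paper's BagWithProb algorithm, whose sampling rule is not described here. The function names, the toy two-dimensional feature vectors, and the parameters `k`, `n_members`, and `seed` are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def bagged_knn_predict(train, query, k=3, n_members=11, seed=0):
    """Plain bagging: each member classifies with a bootstrap resample of `train`."""
    rng = random.Random(seed)
    member_votes = Counter()
    for _ in range(n_members):
        # Draw len(train) items with replacement, every item equally likely.
        resample = rng.choices(train, k=len(train))
        member_votes[knn_predict(resample, query, k)] += 1
    # The ensemble answer is the label chosen by the most members.
    return member_votes.most_common(1)[0][0]


# Toy 2-D "feature vectors" for two hypothetical speakers A and B.
train = [((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"),
         ((2.0, 2.1), "B"), ((2.2, 1.9), "B"), ((1.9, 2.3), "B")]
print(bagged_knn_predict(train, (0.1, 0.1)))
```

Because a bootstrap resample usually keeps most of a query's true neighbors, the members tend to agree with one another, which is consistent with the abstract's remark that resampling alone adds little diversity to kNN ensembles.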

Abstract: K-nearest neighbor (KNN) is an instance-based learning algorithm that can be very competitive with state-of-the-art classification methods. Besides its simplicity, KNN generalizes well, is robust to noisy training data, and is quite effective when a sufficiently large training set is available, so it has been widely used in speaker recognition. Because the generalization ability of an ensemble can be significantly better than that of a single learner, ensemble learning has been a hot topic in recent years. This paper aims to improve recognition speed and accuracy with a novel method that combines an optimized annular-region stratified-sampling k-nearest-neighbor algorithm with the proposed BagWithProb ensemble learning algorithm. A large empirical study reported in the paper shows that the proposed algorithm effectively improves the performance of a speaker recognition system.

Key words: nearest neighbor learner, ensemble learning, speaker recognition, stratified sampling
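One possible reading of annular-region stratified sampling is sketched below: training vectors are grouped into concentric rings by their distance from a center, and a few vectors are kept from each ring so that the kNN search at recognition time runs over a smaller set. The abstract does not say how the rings are defined, so the centroid as center, the equal ring widths, and the names `ring_stratified_sample`, `n_rings`, and `per_ring` are assumptions made for illustration only.

```python
import math
import random


def ring_stratified_sample(points, n_rings=4, per_ring=2, seed=0):
    """Keep at most `per_ring` points from each distance ring around the centroid."""
    rng = random.Random(seed)
    dim = len(points[0])
    centroid = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radii = [math.dist(p, centroid) for p in points]
    max_r = max(radii) or 1.0  # avoid a zero ring width when all points coincide
    width = max_r / n_rings

    # Assign every point to the ring that contains its distance from the centroid.
    rings = [[] for _ in range(n_rings)]
    for p, r in zip(points, radii):
        idx = min(int(r / width), n_rings - 1)  # farthest point goes in the last ring
        rings[idx].append(p)

    # Sample without replacement inside each ring (the strata).
    sample = []
    for ring in rings:
        sample.extend(rng.sample(ring, min(per_ring, len(ring))))
    return sample


# A 5 x 5 grid of hypothetical feature vectors, reduced to at most 8 points.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
print(len(ring_stratified_sample(grid)))
```

Sampling per ring rather than uniformly keeps both the dense core and the sparse outer region represented in the reduced set, which is the usual motivation for stratifying.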

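QIAN Bo, TANG Zhen-min, LI Yan-ping, XU Li-min. Ensemble k-nearest-neighbor speaker recognition algorithm based on stratified sampling[J]. Computer Engineering and Applications, 2007, 43(35): 226-229.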

QIAN Bo, TANG Zhen-min, LI Yan-ping, XU Li-min. New method of optimizing k nearest neighbor ensemble for text-independent speaker recognition[J]. Computer Engineering and Applications, 2007, 43(35): 226-229.