Improved Algorithm of Local Outlier Based on Multi-Instance Learning

doi:10.3778/j.issn.1002-8331.1806-0075

Abstract

Abstract: In the multi-instance learning (MIL) framework, the training set consists of bags, each containing multiple instances represented as attribute-value pairs, and the learner is trained on the instances within each bag. Traditional MIL-based local outlier detection algorithms apply this framework to the data set and convert the multi-instance problem into a single-instance problem. During bag conversion, however, they weight each instance by the proportion of its feature length. They neither examine the instances that most influence the result and analyze why, nor adjust the weights of those instances dynamically, which degrades outlier detection. To address this problem and better fit the internal distribution of the data, this paper proposes FWMIL-LOF, an improved local outlier detection algorithm based on multi-instance learning. Within the MIL framework, the algorithm introduces a weight function describing data importance into the bag conversion and adjusts that function through a defined penalty strategy, thereby determining the weight of instances with different feature attributes within their bag. Simulation analysis on an enterprise's real-time acquisition and monitoring system, together with a comparison against other classical local outlier detection algorithms, verifies that the improved algorithm detects outliers more effectively.

Key words: Multi-Instance Learning (MIL), weight mechanism, feature, penalty strategy
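The pipeline outlined in the abstract (collapse each bag into one vector with per-instance weights, then score the vectors with LOF) can be illustrated with a minimal sketch. This is not the FWMIL-LOF implementation: the abstract does not give the weight function or the penalty strategy, so the weights here are uniform placeholders, and the names `collapse_bag` and `lof_scores` are hypothetical. The LOF step is the standard formulation, simplified to use exactly k neighbours.

```python
import math

def collapse_bag(bag, weights):
    """Collapse one bag (a list of instances) into a single vector via a weighted mean.
    The weights are supplied by the caller; the paper's weight function is not reproduced."""
    total = sum(weights)
    dim = len(bag[0])
    return [sum(w * inst[d] for w, inst in zip(weights, bag)) / total
            for d in range(dim)]

def lof_scores(points, k):
    """Standard local outlier factor; a higher score means more outlying.
    Simplification: exactly k neighbours are used (distance ties are not expanded),
    and duplicate points are assumed absent."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        lrd.append(k / reach)
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

# Illustrative data only: four similar bags and one bag far from the rest.
bags = [
    [[1.0, 1.0], [1.2, 0.8]],
    [[0.9, 1.1], [1.1, 1.0]],
    [[1.0, 0.9], [0.8, 1.2]],
    [[1.1, 1.1], [1.0, 1.3]],
    [[8.0, 8.0], [9.0, 7.5]],
]
weights = [[1.0, 1.0] for _ in bags]  # placeholder: uniform, not the paper's weights
vectors = [collapse_bag(b, w) for b, w in zip(bags, weights)]
scores = lof_scores(vectors, k=2)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, swapping the uniform `weights` for instance-specific values is the only point where a weighting scheme such as the one the paper proposes would enter; the LOF step is unchanged.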

DENG Hao, QIN Ling. Improved Algorithm of Local Outlier Based on Multi-Instance Learning[J]. Computer Engineering and Applications, 2019, 55(18): 38-44.

