Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (4): 61-65.
QIU Jiang, QIN Zheng
邱 江,秦 拯
Abstract: Because real-time attack data sets contain missing features and a large number of non-attack samples, they exhibit incomplete feature distributions and skewed class distributions, both of which hinder clustering analysis. To address this problem, a two-phase clustering algorithm for incomplete attack data sets is proposed. First, a standard two-class support vector machine is used to separate out the non-attack samples and balance the class distribution. Second, a method for measuring the distance between incomplete samples is proposed and applied in the nearest-neighbor interval fuzzy C-means algorithm to perform the clustering. Experimental results show that the proposed algorithm achieves higher clustering accuracy than existing algorithms.
Key words: clustering analysis, missing feature, support vector machine, nearest-neighbor interval
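The abstract does not spell out the proposed distance measure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes missing features are encoded as NaN, substitutes the well-known partial distance strategy (distance over jointly observed features, rescaled by the fraction observed) for the paper's measure, and then represents each missing feature by the interval spanned by the sample's q nearest neighbors. The function names `partial_distance` and `neighbor_intervals` and the parameter `q` are hypothetical.

```python
import math

NAN = float("nan")

def partial_distance(a, b):
    """Distance over features observed in both samples (assumed stand-in measure).

    The squared differences are rescaled by total/observed feature count so that
    pairs with few shared features are not favored artificially.
    """
    shared = [(x, y) for x, y in zip(a, b)
              if not (math.isnan(x) or math.isnan(y))]
    if not shared:
        return math.inf  # nothing to compare on
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def neighbor_intervals(data, index, q):
    """Return one (low, high) interval per feature of data[index].

    Observed features yield a degenerate interval (v, v). A missing feature is
    bounded by the min and max of that feature among the q nearest neighbors
    that actually observe it; if none do, the interval stays (NaN, NaN).
    """
    target = data[index]
    others = [row for i, row in enumerate(data) if i != index]
    nearest = sorted(others, key=lambda row: partial_distance(target, row))[:q]
    intervals = []
    for j, value in enumerate(target):
        if not math.isnan(value):
            intervals.append((value, value))
            continue
        seen = [row[j] for row in nearest if not math.isnan(row[j])]
        intervals.append((min(seen), max(seen)) if seen else (NAN, NAN))
    return intervals

samples = [[1.0, NAN], [1.1, 2.0], [0.9, 4.0], [9.0, 50.0]]
print(neighbor_intervals(samples, 0, q=2))
```

In an interval-based fuzzy C-means, such intervals would take the place of the missing values during clustering; that step is omitted here because the abstract gives no details on it.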
QIU Jiang, QIN Zheng. Two-phase clustering algorithm for incomplete attack data set[J]. Computer Engineering and Applications, 2016, 52(4): 61-65.
邱 江,秦 拯. 面向不完全攻击数据集的两阶段聚类算法[J]. 计算机工程与应用, 2016, 52(4): 61-65.
URL: http://cea.ceaj.org/EN/
http://cea.ceaj.org/EN/Y2016/V52/I4/61