Computer Engineering and Applications, 2020, Vol. 56, Issue (9): 148-155. DOI: 10.3778/j.issn.1002-8331.1901-0095


Clustering Method by Combining Simplex Mapping and Entropy Weighting

AN Ning, JIANG Siyuan, TANG Chen, YANG Jiaoyun   

  1. National Smart Eldercare International S&T Cooperation Base, Hefei University of Technology, Hefei 230601, China
  2. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
  • Online: 2020-05-01 Published: 2020-04-29





Because categorical and numerical attributes differ in nature, clustering methods for mixed datasets usually have to handle the two types separately, which makes such methods harder to design and implement. In addition, the amount of information carried by different attributes varies considerably, yet current methods treat all attributes equally. This paper proposes a weighted, simplex-based mapping method for mixed-data clustering. It maps categorical attributes to high-dimensional numerical attributes based on simplex theory and applies entropy theory to weight the attributes, thereby establishing a similarity measure. This measure is integrated into the K-Means framework to form a clustering method. Experiments on six UCI mixed datasets show that the proposed method outperforms the traditional mapping method and the K-Prototype method, improving accuracy by 2.70% and 18.33%, respectively.
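
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact simplex construction or weighting formula, so the sketch assumes that each category is placed on a vertex of a regular simplex (scaled one-hot vectors, so any two distinct categories are at distance 1) and that attribute weights are proportional to Shannon entropy. All function names are hypothetical, and only categorical attributes are covered.

```python
import math
from collections import Counter


def simplex_map(values):
    """Map each category to a vertex of a regular simplex.

    Assumption: scaled one-hot vectors are used, so every pair of
    distinct categories is at Euclidean distance exactly 1.
    """
    categories = sorted(set(values))
    k = len(categories)
    scale = 1.0 / math.sqrt(2.0)
    vertex = {
        c: [scale if i == j else 0.0 for j in range(k)]
        for i, c in enumerate(categories)
    }
    return [vertex[v] for v in values]


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical value distribution."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def entropy_weights(columns):
    """Normalize per-attribute entropies so they sum to 1.

    Assumption: weight is proportional to entropy; the paper may use
    a different formula.
    """
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        # Every attribute is constant: fall back to uniform weights.
        return [1.0 / len(columns)] * len(columns)
    return [e / total for e in entropies]


def weighted_distance(x_blocks, y_blocks, weights):
    """Weighted Euclidean distance between two records.

    Each record is a list of per-attribute vectors (one block per attribute).
    """
    total = 0.0
    for xb, yb, w in zip(x_blocks, y_blocks, weights):
        total += w * sum((a - b) ** 2 for a, b in zip(xb, yb))
    return math.sqrt(total)
```

A distance of this form could stand in for the plain Euclidean distance in the assignment step of K-Means, which is the role the abstract assigns to the similarity measure.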

Key words: vector mapping, entropy-based weight, similarity measurement, mixed datasets, clustering analysis


