Feature selection based on feature similarity measure


Computer Engineering and Applications, 2010, Vol. 46, Issue (20): 153-156. DOI: 10.3778/j.issn.1002-8331.2010.20.043

• Artificial Intelligence •


JIANG Sheng-yi¹, WANG Lian-xi²

1. School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China
2. School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, China

Received: 2010-04-14 Revised: 2010-05-20 Online: 2010-07-11 Published: 2010-07-11
Contact: JIANG Sheng-yi


Abstract

This paper proposes a feature selection algorithm based on a feature similarity measure. The method clusters the features according to the similarity measure and then chooses representative features from each cluster. Finally, the feature subset is obtained by removing the representatives that are irrelevant or only weakly relevant to the class feature. Theoretical analysis indicates that the method has low time complexity and can be applied to feature selection for high-dimensional data. Experiments on UCI datasets, comparing the algorithm with other classic feature selection approaches, show that it performs better in terms of dimensionality reduction and classification performance.

Key words: feature selection, similarity, feature clustering, classification

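Abstract (translated from the Chinese): A feature selection method based on feature correlation is proposed. The method first clusters the features, using the degree of mutual dependence between features (their correlation) as the clustering criterion. It then picks representative features from each feature cluster and removes from the selected features those that are irrelevant or only weakly relevant to the target feature. The features that remain form the final feature subset. Theoretical analysis shows that the method is computationally efficient, has low time complexity, and is suitable for feature selection on large-scale datasets. Experimental comparison and analysis on UCI datasets against classic methods from the literature show that the proposed method performs better in feature reduction and classification.

Key words (translated from the Chinese): feature selection, correlation, feature clustering, classification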
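The three stages named in the abstract (cluster the features by similarity, keep a representative from each cluster, drop representatives that are weakly relevant to the class) can be illustrated with a minimal Python sketch. The abstract does not state which similarity measure, clustering procedure, or thresholds the paper uses, so everything concrete here is an assumption made for illustration: absolute Pearson correlation as the similarity and relevance measure, a one-pass leader-style clustering, and the hypothetical names `abs_corr`, `select_features`, `sim_threshold`, and `rel_threshold`.

```python
import numpy as np


def abs_corr(a, b):
    """Absolute Pearson correlation; 0.0 if either vector is constant.

    Assumed stand-in for the paper's (unspecified) similarity measure.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return 0.0 if denom == 0 else abs(float(a @ b) / denom)


def select_features(X, y, sim_threshold=0.8, rel_threshold=0.1):
    """Return sorted indices of the selected feature columns of X."""
    n_features = X.shape[1]
    # Relevance of every feature to the class feature y.
    relevance = [abs_corr(X[:, j], y) for j in range(n_features)]

    # Stage 1: one-pass clustering (an assumption). A feature joins the
    # first cluster whose leader (first member) it is similar enough to;
    # otherwise it starts a new cluster.
    clusters = []
    for j in range(n_features):
        for cluster in clusters:
            if abs_corr(X[:, j], X[:, cluster[0]]) >= sim_threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])

    # Stage 2: the representative of a cluster is its member that is most
    # relevant to the class feature (also an assumption).
    representatives = [max(c, key=lambda j: relevance[j]) for c in clusters]

    # Stage 3: drop representatives that are weakly relevant or irrelevant.
    return sorted(j for j in representatives if relevance[j] >= rel_threshold)
```

With a one-pass clustering like this, each feature is compared only against the current cluster leaders rather than against every other feature, which is one way a cluster-then-filter design can keep the cost low on high-dimensional data. Whether the paper obtains its complexity bound the same way is not stated in the abstract.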

CLC Number: TP301

JIANG Sheng-yi, WANG Lian-xi. Feature selection based on feature similarity measure[J]. Computer Engineering and Applications, 2010, 46(20): 153-156.

蒋盛益, 王连喜. 基于特征相关性的特征选择[J]. 计算机工程与应用, 2010, 46(20): 153-156.

