Computer Engineering and Applications (计算机工程与应用) ›› 2008, Vol. 44 ›› Issue (13): 30-32.

• Doctoral Forum •

Attribute discretization algorithm based on rough set theory

陈 昊1,2,张 旻1,2,杨俊安1,2   

  1. Electronic Engineering Institute of the PLA, Hefei 230037
  2. Key Laboratory of Electronic Restriction, Anhui Province, Hefei 230037
  • Received: 2007-12-25 Revised: 2008-01-28 Published: 2008-05-01 Online: 2008-05-01
  • Corresponding author: CHEN Hao

Method of data discretization based on rough set theory

CHEN Hao1,2,ZHANG Min1,2,YANG Jun-an1,2   

  1. Electronic Engineering Institute, Hefei 230037, China
  2. Key Laboratory of Electronic Restriction, Anhui Province, Hefei 230037, China
  • Received: 2007-12-25 Revised: 2008-01-28 Online: 2008-05-01 Published: 2008-05-01
  • Contact: CHEN Hao

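Abstract (translated from the Chinese): Discretizing the continuous attributes of a decision system, that is, dividing each continuous attribute into several intervals and assigning a discrete value to each interval, is important for the subsequent machine learning stage. The paper first studies an algorithm that computes the set of candidate cuts satisfying the optimal partition of the decision system. Building on condition attribute importance and a greedy algorithm, it then proposes a new heuristic algorithm for determining the subset of result cuts. The proposed discretization algorithm takes into account and reflects the basic characteristics and advantages of rough set theory, and can achieve fairly good discretization results for continuous attributes.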

Keywords (translated from the Chinese): rough set, optimal partition, discretization, candidate cuts, result cuts

Abstract: Discretizing the continuous attribute values of a decision system, which divides the continuous values into intervals and assigns a discrete value to each interval, contributes greatly to subsequent machine learning. This paper first studies a new algorithm for computing candidate cuts for the best partition of a decision system, and then proposes a heuristic method, based on the importance of condition attributes and a greedy algorithm, for selecting the result cuts. The two algorithms take the characteristics of rough set theory into account and embody its advantages. Moreover, good discretization results may be expected from them.
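
To make the two stages named in the abstract concrete, the sketch below shows one generic way they can be realized in Python. It is not the authors' algorithm: the abstract does not give their procedure, so the candidate cuts here are simply midpoints between adjacent attribute values whose decision labels are not a single shared class, and the greedy step picks, at each round, the cut that separates the most still-unseparated pairs of objects with different decisions. The paper's attribute-importance measure is not modelled. The function names `candidate_cuts` and `greedy_cuts` and the sample table are hypothetical.

```python
from itertools import combinations


def candidate_cuts(values, decisions):
    """Return midpoints between adjacent distinct values that may need a cut.

    A midpoint is skipped only when both neighbouring values carry the same
    single decision label, because such a cut separates nothing useful.
    """
    labels_by_value = {}
    for v, d in zip(values, decisions):
        labels_by_value.setdefault(v, set()).add(d)
    ordered = sorted(labels_by_value)
    cuts = []
    for lo, hi in zip(ordered, ordered[1:]):
        if len(labels_by_value[lo] | labels_by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def greedy_cuts(table, decisions):
    """Greedily choose (attribute_index, cut) pairs for a decision table.

    table is a list of rows of numeric condition attributes. Assumption:
    the score of a cut is the number of pending object pairs it separates,
    not the attribute-importance measure used in the paper.
    """
    n_attrs = len(table[0])
    candidates = [
        (a, c)
        for a in range(n_attrs)
        for c in candidate_cuts([row[a] for row in table], decisions)
    ]
    # Pairs of objects with different decisions that are not yet separated.
    pending = {
        (i, j)
        for i, j in combinations(range(len(table)), 2)
        if decisions[i] != decisions[j]
    }

    def separated(cut):
        a, c = cut
        return {(i, j) for i, j in pending
                if (table[i][a] < c) != (table[j][a] < c)}

    chosen = []
    while pending and candidates:
        best = max(candidates, key=lambda cut: len(separated(cut)))
        gained = separated(best)
        if not gained:
            break  # remaining pairs cannot be separated by any candidate
        chosen.append(best)
        candidates.remove(best)
        pending -= gained
    return chosen


# Hypothetical toy table: the second attribute is constant and yields no cuts.
rows = [(1.0, 10.0), (2.0, 10.0), (3.0, 10.0), (4.0, 10.0)]
labels = ["a", "a", "b", "b"]
print(greedy_cuts(rows, labels))  # [(0, 2.5)]
```

In this toy table a single cut at 2.5 on the first attribute separates every pair of objects with different decisions, so the greedy loop stops after one round. Scoring by separated pairs is only one possible choice; a measure weighted by attribute importance, as the abstract describes, would replace the `key` function.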

Key words: rough set, best partition, discretization, candidate cuts, result cuts