Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (22): 39-42. DOI: 10.3778/j.issn.1002-8331.2008.22.011

• Theoretical Research •

Knowledge reduction of general rough sets based on instance learning
(Original Chinese title: 基于样本学习的广义粗糙集知识约简)

WANG Li-min1,2, MAO Yu-ting3, LI Xiong-fei1,2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  2. Key Lab of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  3. Institute of Information Spreading Engineering, Changchun University of Technology, Changchun 130021, China
  • Received: 2008-04-30 Revised: 2008-06-13 Online: 2008-07-11 Published: 2008-07-11
  • Contact: WANG Li-min

Abstract: Because condition attributes differ across instances, both in their distribution and in the subjective characteristics they reflect, each instance corresponds to a local mapping of the real situation. This paper establishes a correspondence between instance knowledge in rough set theory and its information-theoretic representation, and gives a method for deriving a reduced decision table by attribute reduction. Continuous attributes are handled with a post-discretization strategy, which achieves a dynamic trade-off between discretization efficiency and information loss. The concept of conditional mutual information between relative values is proposed to measure the relevance of the condition attributes within a single instance, so that existing data can be fully exploited when processing incomplete information systems. Even when prior knowledge is insufficient, new rules can be constructed through active learning and added to the knowledge base. The approach thus extends the application domain of rough set theory. Experimental results on UCI machine learning data sets, together with an analysis of examples, demonstrate that the proposed algorithm is reasonable and effective.
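For readers unfamiliar with attribute reduction, the following minimal sketch shows the classical (Pawlak) positive-region formulation on a decision table. It is not the instance-based method proposed in the paper, whose details are not given in this abstract. The function names `partition`, `dependency` and `greedy_reduct`, and the toy table `TABLE`, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def partition(rows: Sequence[Row], attrs: Sequence[str]) -> List[List[int]]:
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows: Sequence[Row], cond: Sequence[str], dec: str) -> float:
    """Fraction of rows in the positive region of dec with respect to cond.

    A row is in the positive region when every row that shares its
    condition values also shares its decision value.
    """
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond):
        if len({rows[i][dec] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows: Sequence[Row], cond: Sequence[str], dec: str) -> List[str]:
    """Backward elimination: drop each attribute whose removal keeps the
    dependency degree unchanged. The result depends on attribute order
    and is a reduct, not necessarily the smallest one."""
    full = dependency(rows, cond, dec)
    kept = list(cond)
    for attr in list(cond):
        trial = [a for a in kept if a != attr]
        if dependency(rows, trial, dec) == full:
            kept = trial
    return kept


# Hypothetical toy decision table: "flu" is the decision attribute.
TABLE: List[Row] = [
    {"fever": "no", "cough": "no", "age": "young", "flu": "no"},
    {"fever": "no", "cough": "yes", "age": "young", "flu": "no"},
    {"fever": "yes", "cough": "no", "age": "young", "flu": "yes"},
    {"fever": "yes", "cough": "yes", "age": "old", "flu": "yes"},
]

REDUCT = greedy_reduct(TABLE, ["fever", "cough", "age"], "flu")
```

On this toy table `REDUCT` is `["fever"]`, because `fever` alone already determines `flu` for every row, whereas `cough` and `age` together leave two rows indistinguishable. The reduced decision table is then the original table restricted to the kept attributes plus the decision column.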

Key words: rough set, conditional mutual information between relative values, active learning