Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (26): 11-13.

• Doctoral Forum •

Max-margin-based selection of gene expression rules

CAI Ruichu1, WANG Meihua2, HAO Zhifeng1, WEN Wen1

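  1. Faculty of Computer Science, Guangdong University of Technology, Guangzhou 510006
  2. College of Informatics, South China Agricultural University, Guangzhou 510642
  • Received: 1900-01-01 Revised: 1900-01-01 Published: 2011-09-11 Online: 2011-09-11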

Max margin based gene expression pattern selection and its applications in computer aided diagnosis

CAI Ruichu1, WANG Meihua2, HAO Zhifeng1, WEN Wen1

  1. Faculty of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
  2. College of Informatics, South China Agricultural University, Guangzhou 510642, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-09-11 Published:2011-09-11

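Abstract: Discovering gene expression rules associated with specific diseases from gene expression data is of great significance for computer-aided diagnosis. To address the shortcomings of existing interestingness measures for association rules, a max-margin-based strategy for selecting gene expression rules is proposed. The strategy takes into account the distance between a gene expression rule and the samples of both its own class and the other classes, which gives it strong rule-selection ability. An association rule mining algorithm that combines the max-margin criterion with an incremental association rule mining algorithm efficiently discovers the Top-K max-margin gene expression rules. Experimental results on real gene expression datasets confirm the effectiveness of the max-margin rule selection strategy and the efficiency of the mining algorithm.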

Keywords: association rules, gene expression data, max margin, computer-aided diagnosis

Abstract: Discovering disease-related gene expression rules is of great importance to computer-aided diagnosis. A max-margin-based interestingness measure is proposed, which improves the generalization ability of association rules by taking into account both the within-class and the between-class distances. A Top-K max-margin association rule mining procedure is also devised to efficiently discover interesting rules in high-dimensional gene expression data. Experimental results show the effectiveness of the interestingness measure and the efficiency of the mining procedure.

Key words: association rule, gene expression data, max-margin, computer aided diagnosis
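
The abstract describes the selection criterion only in words, so the following Python sketch illustrates one possible reading of it. Everything in it is an assumption made for illustration and is not taken from the paper: the rule representation (a tuple of `(gene, level)` items over discretized expression values), the distance (the fraction of rule items a sample fails to match), the margin (mean distance to other-class samples minus mean distance to same-class samples), and the hypothetical function names `rule_distance`, `margin`, and `top_k_rules`. Candidates are enumerated by brute force here; the paper's incremental mining algorithm is not reproduced.

```python
import heapq
from itertools import combinations


def rule_distance(rule, sample):
    """Assumed distance: fraction of the rule's (gene, level) items the sample does not match."""
    misses = sum(1 for gene, level in rule if sample.get(gene) != level)
    return misses / len(rule)


def margin(rule, target, samples, labels):
    """Assumed margin: mean distance to other-class samples minus mean distance to target-class samples."""
    same = [rule_distance(rule, s) for s, y in zip(samples, labels) if y == target]
    other = [rule_distance(rule, s) for s, y in zip(samples, labels) if y != target]
    if not same or not other:
        return float("-inf")  # the margin is undefined without both classes
    return sum(other) / len(other) - sum(same) / len(same)


def top_k_rules(samples, labels, target, k, max_len=2):
    """Brute-force candidate enumeration, keeping the k rules with the largest margin."""
    # Candidate items are the (gene, level) pairs seen in target-class samples.
    items = sorted({(g, v) for s, y in zip(samples, labels) if y == target
                    for g, v in s.items()})
    candidates = []
    for n in range(1, max_len + 1):
        for combo in combinations(items, n):
            genes = [g for g, _ in combo]
            if len(set(genes)) == len(genes):  # at most one level per gene
                candidates.append(combo)
    return heapq.nlargest(k, candidates,
                          key=lambda r: margin(r, target, samples, labels))


if __name__ == "__main__":
    # Toy data with hypothetical gene names and class labels.
    samples = [
        {"g1": "high", "g2": "low"},
        {"g1": "high", "g2": "high"},
        {"g1": "low", "g2": "low"},
        {"g1": "low", "g2": "high"},
    ]
    labels = ["tumor", "tumor", "normal", "normal"]
    for rule in top_k_rules(samples, labels, "tumor", k=2):
        print(rule, margin(rule, "tumor", samples, labels))
```

In this toy data the single-item rule on `g1` separates the two classes perfectly, so it receives the largest margin under the assumed definition. A practical miner would prune the candidate space instead of enumerating it exhaustively, which is the role the abstract assigns to the incremental mining algorithm.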