Computer Engineering and Applications, 2010, Vol. 46, Issue (25): 124-126. DOI: 10.3778/j.issn.1002-8331.2010.25.037

• Database, Signal and Information Processing •

DNA sequence classification based on ant colony optimization clustering algorithm

LIANG Bing, CHEN De-yun

  1. College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  • Received: 2009-06-12 Revised: 2010-02-22 Online: 2010-09-01 Published: 2010-09-01
  • Contact: LIANG Bing



Abstract: A modified ant-based clustering algorithm for DNA sequence analysis is presented. To improve the running time and accuracy of ant-based clustering, the proposed algorithm (ACOC) incorporates two main modifications relative to ACA: an adaptive perception scheme in the density function and a cooling scheme for α-adaptation. Features of the DNA sequences are extracted from their dinucleotide frequencies, and the Pearson correlation coefficient is then used as the similarity measure between sequences. Experimental results on EMBL-DNA datasets show that ACOC performs well compared with statistical clustering and k-means, and that it is suitable for large-scale DNA sequence classification.
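
The two preprocessing steps named in the abstract, dinucleotide-frequency features and Pearson correlation as the similarity measure, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `dinucleotide_frequencies` and `pearson`, the use of overlapping pairs, and the choice to skip pairs containing non-ACGT symbols are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

# The 16 possible dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return a 16-element vector of overlapping dinucleotide frequencies.

    Assumption: pairs containing symbols other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = {d: 0 for d in DINUCLEOTIDES}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors.

    Returns 0.0 when either vector has zero variance (an assumed convention).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    f1 = dinucleotide_frequencies("ACGTACGTAC")
    f2 = dinucleotide_frequencies("ACGTTCGTAC")
    print(pearson(f1, f2))
```

In an ant-based clustering setting, a similarity in [-1, 1] such as this one would typically be converted to a dissimilarity (for example `1 - r`) before being used in the density function; that conversion is also an assumption here, since the abstract does not specify it.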

Key words: DNA sequence analysis, ant-based clustering algorithm, classification, feature extraction, Pearson correlation coefficient

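Abstract (Chinese version): To address the inefficiency and low classification accuracy of current clustering algorithms on DNA sequence data, a DNA sequence classification method based on an ant colony optimization clustering algorithm (ACOC) is proposed. An adaptive perception term is added to the density function, and the cooling strategy for α-adaptation from simulated annealing is applied. Features are extracted from the DNA sequences according to their distribution characteristics, and the Pearson correlation coefficient is introduced into the ant colony clustering algorithm as the similarity measure. Performance tests on four datasets from the EMBL-DNA database, compared against statistical clustering and k-means, show that the method has some advantage in both running time and accuracy and is suitable for large-scale DNA sequence classification.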

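Key words (Chinese version): DNA sequence analysis, ant colony clustering algorithm, classification, feature extraction, Pearson correlation coefficient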
