Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (21): 43-45.

• Academic Discussion •

A network-based study of the properties of coding and non-coding regions of DNA sequences

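FU Xin, XU Zhen-yuan

  1. School of Information Engineering and School of Science, Southern Yangtze University, Wuxi, Jiangsu 214122, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-07-21 Published: 2007-07-21
  • Corresponding author: FU Xin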

Approach to properties of coding and non-coding DNA sequences based on complex network theory

FU Xin, XU Zhen-yuan

  1. School of Science, School of Information Science, Wuxi Southern Yangtze University, Wuxi, Jiangsu 214122, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-07-21 Published: 2007-07-21
  • Contact: FU Xin

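Abstract (translated from the Chinese): A method based on graph theory is used to analyze DNA sequences (fragments) together with their coding and non-coding regions. The method studies the topological structure of an organism through complex networks, characterizing the network topology mainly by measuring the clustering coefficient (also known as the cluster coefficient). Based on the correlation between prefixes and suffixes in a DNA sequence, correlation networks are constructed for the selected DNA sequences (fragments) and for their coding and non-coding regions. The degree distributions of these networks follow a power law, and the networks have relatively large clustering coefficients. The results show that the constructed networks exhibit the characteristics of both small-world and scale-free networks, demonstrating that DNA sequences are not entirely random but are deterministically structured sequences with random perturbations, especially in the coding regions.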

Key words (translated from the Chinese): DNA sequence, coding region, non-coding region, clustering coefficient, complex network

Abstract: This paper proposes a DNA sequence analysis method based on graph-theoretic concepts. The method investigates the topology of an organism's genome through a complex network, and the network topology is characterized by measuring the clustering coefficient. A correlation network is constructed from the correlations between prefixes and suffixes in the DNA sequence. The degree distribution of this network obeys a power law, and its clustering coefficient is relatively large. The coexistence of a skewed degree distribution and strong clustering indicates that the network has a hierarchical structure, and demonstrates that DNA sequences are not globally stochastic sequences but sequences with a definite structure subject to stochastic perturbation.

Key words: DNA sequence, coding DNA (CDNA), non-coding DNA (NCDNA), clustering coefficient, complex network
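
The abstract does not spell out how the prefix/suffix correlation network is built, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes that nodes are words of a fixed length `k` read from the sequence, and that two words are linked when they occur at adjacent positions, so that the suffix of one is the prefix of the next. The function names, the parameter `k`, and the sample sequences are hypothetical. The sketch then computes the two quantities the abstract relies on: the degree distribution and the clustering coefficient.

```python
from collections import defaultdict
from itertools import combinations


def build_overlap_network(seq, k=3):
    """Build an undirected graph whose nodes are length-k words of seq.

    Assumption (not from the paper): two words are linked when they occur
    at adjacent positions, so the (k-1)-suffix of one is the prefix of the next.
    """
    adj = defaultdict(set)
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops such as AAA followed by AAA
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of node that are themselves linked."""
    neighbours = adj[node]
    n = len(neighbours)
    if n < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (n * (n - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    if not adj:
        return 0.0
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


def degree_histogram(adj):
    """Map each degree value to the number of nodes having that degree."""
    hist = defaultdict(int)
    for node in adj:
        hist[len(adj[node])] += 1
    return dict(hist)
```

On a real sequence one would plot the degree histogram on log-log axes to look for the power-law behaviour reported in the abstract, and compare the average clustering coefficient with that of a random graph of the same size to judge whether it is "relatively large".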