Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (27): 180-185.

• Graphics, Images, and Pattern Recognition •

Application of sequential feature to analyzing gene splicing signals

MA Meng1, WANG Yang2

  1. School of Computer Science and Technology, Anhui University, Hefei 230032, China
  2. Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill 27599-7365, USA
  • Online: 2012-09-21 Published: 2012-09-24

Abstract: Gene alternative splicing is regulated by a variety of splicing signals, such as splice sites and splicing regulatory elements. How to identify these signals and characterize their distribution across the genome is an interesting question. This paper presents a scoring algorithm based on sequential features that assigns each signal a score indicating its strength. A classifier built on this scoring algorithm can be used to predict and identify new splicing signals. In addition, applying the scoring algorithm to the genome-wide distribution of splice sites and splicing regulatory elements shows that the two types of signals have complementary characteristics. The study provides a new method for analyzing biological sequence data and a new way, from a bioinformatics perspective, to study the regulatory sequences that control gene expression.

Key words: sequential feature, gene splicing, splice sites, splicing regulatory elements
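
The abstract describes a score that reflects signal strength and a classifier built on that score, but it does not give the algorithm itself. As a rough illustration only, the sketch below uses a position-specific log-odds score, a common stand-in technique and not necessarily the paper's method. The function names, the uniform background of 0.25, the pseudocount, the threshold of 0.0, and the toy training sequences are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a per-position log-odds table from aligned, equal-length sequences.

    Assumption: a uniform background frequency and a fixed pseudocount;
    neither value comes from the paper.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all training sequences must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    table = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        table.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return table

def score(table, candidate):
    """Sum the per-position log-odds; a higher score means a stronger signal."""
    if len(candidate) != len(table):
        raise ValueError("candidate length must match the table length")
    # Bases outside ACGT (for example N) contribute a neutral 0.0.
    return sum(col.get(base, 0.0) for col, base in zip(table, candidate))

def is_signal(table, candidate, threshold=0.0):
    """Hypothetical threshold classifier built on top of the score."""
    return score(table, candidate) >= threshold

# Toy, made-up training set loosely resembling donor splice sites.
TOY_SITES = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "AAGGTAAGT"]
TOY_TABLE = build_log_odds(TOY_SITES)
```

With a table like this, a candidate that matches the training set at most positions scores above zero and one that matches at none scores below zero. A real study would choose the threshold from labeled data instead of fixing it at zero.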