Prediction of protein coding regions by Takagi-Sugeno model

doi:10.3778/j.issn.1002-8331.2009.26.065

Computer Engineering and Applications, 2009, Vol. 45, Issue (26): 216-219. DOI: 10.3778/j.issn.1002-8331.2009.26.065

GUO Shuo (郭烁)^1,2, ZHU Yi-sheng (朱义胜)^1

1. College of Information Engineering, Dalian Maritime University, Dalian, Liaoning 116026, China
2. College of Information Engineering, Shenyang Institute of Chemical Technology, Shenyang 110142, China

Received: 2009-03-03 Revised: 2009-04-10 Online: 2009-09-11 Published: 2009-09-11
Contact: GUO Shuo

Abstract

Identifying protein-coding regions in DNA sequences is an important step in gene identification. Because gene sequence data sets are large, many statistical identification algorithms generalize poorly and run slowly. Since coding and non-coding regions differ in nucleotide composition, this paper applies a Takagi-Sugeno fuzzy model to coding-region identification. First, a fuzzy clustering algorithm based on the fuzzy likelihood function determines the number of fuzzy partitions of the system. Then, taking the number of clusters as the number of rules, a Takagi-Sugeno locally linearized model is built. Finally, the consequent parameters of the model are identified by least squares (LS). The algorithm not only locates coding regions but also identifies the position of the first nucleotide of each codon, which matters for accurate translation into a protein sequence and for the study of protein structure. The algorithm is simple and efficient. Simulation results indicate that it is well suited to coding-region identification and is at least comparable to existing coding-region identification tools.

Key words: coding region in DNA sequence, codon, Takagi-Sugeno fuzzy model, fuzzy clustering, least squares (LS)
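
The last step named in the abstract, least-squares identification of the consequent parameters of a Takagi-Sugeno model, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single scalar input feature, Gaussian membership functions with fixed centers and a shared width (standing in for the output of the fuzzy-likelihood clustering step, which is not reproduced here), and one local linear model `y = a*x + b` per rule. All function and variable names are hypothetical.

```python
import math

def memberships(x, centers, width):
    """Normalized Gaussian firing strengths of each rule for a scalar input x."""
    raw = [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def design_row(x, centers, width):
    """Row [w1*x, w1, ..., wc*x, wc]; the model output is linear in the parameters."""
    row = []
    for w in memberships(x, centers, width):
        row.extend([w * x, w])
    return row

def solve_linear(a, b):
    """Solve a @ theta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    theta = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * theta[k] for k in range(i + 1, n))
        theta[i] = (m[i][n] - s) / m[i][i]
    return theta

def fit_consequents(xs, ys, centers, width):
    """Least-squares estimate of [a1, b1, ..., ac, bc] via the normal equations."""
    phi = [design_row(x, centers, width) for x in xs]
    p = len(phi[0])
    ata = [[sum(r[i] * r[j] for r in phi) for j in range(p)] for i in range(p)]
    atb = [sum(r[i] * y for r, y in zip(phi, ys)) for i in range(p)]
    return solve_linear(ata, atb)

def predict(x, theta, centers, width):
    """Takagi-Sugeno output: membership-weighted sum of the local linear models."""
    return sum(p * t for p, t in zip(design_row(x, centers, width), theta))

if __name__ == "__main__":
    # Synthetic data from a known two-rule model (illustrative values only).
    centers, width = [0.0, 1.0], 0.3
    true_theta = [2.0, 0.0, -1.0, 3.0]
    xs = [i / 20 for i in range(21)]
    ys = [predict(x, true_theta, centers, width) for x in xs]
    print(fit_consequents(xs, ys, centers, width))
```

Because the rule premises are held fixed, the output is linear in the consequent parameters, which is why ordinary least squares is enough for this step. In practice a library solver such as `numpy.linalg.lstsq` would be a more robust choice than hand-written normal equations.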

CLC number: TN911.72

Cite this article: GUO Shuo, ZHU Yi-sheng. Prediction of protein coding regions by Takagi-Sugeno model[J]. Computer Engineering and Applications, 2009, 45(26): 216-219.

