利用连通分支对基因表示数据的聚类算法

计算机工程与应用 ›› 2008, Vol. 44 ›› Issue (9): 152-155.

• 数据库、信号与信息处理 •

周海岩,严云洋

淮阴工学院计算机工程系，江苏淮安 223001

收稿日期:2007-03-09 修回日期:2007-09-13 出版日期:2008-03-21 发布日期:2008-03-21
通讯作者: 周海岩

Algorithm for clustering gene expression data using connected components

ZHOU Hai-yan, YAN Yun-yang

Department of Computer Engineering, Huaiyin Institute of Technology, Huaian, Jiangsu 223001, China

Received: 2007-03-09 Revised: 2007-09-13 Online: 2008-03-21 Published: 2008-03-21
Contact: ZHOU Hai-yan

摘要/Abstract

摘要： 在生命科学中，需要对物种及基因进行分类，以获得对种群固有结构的认识。利用数据聚类方法，有效地辨别／识别基因表示数据的模式，对它们进行分类。将特征相似性大的归为一类，特征相异性大的归为不同类。这对于研究基因的结构、功能、以及不同种类基因之间的关系都具有重要意义。利用图论的方法对分子生物学中基因表示数据进行初始聚类，然后再结合别的算法，如K－近邻自学习聚类算法或基于中心点的自学习聚类算法，对其进一步求精。对于某种聚类判别准则，能够产生全局最优簇。最后对算法进行了分析和讨论，并用模拟数据进行了实验验证。

关键词: 基因表示数据, 数据聚类, 簇类, 无向图, 连通分支

Abstract: In the life sciences, species and genes must be classified in order to understand the inherent structure of populations. Data clustering can effectively identify patterns in gene expression data and categorize them: items with highly similar features are grouped into one class, and items with highly dissimilar features are placed in different classes. This is important for studying the structure and function of genes and the relationships between different kinds of genes. In this paper, gene expression data from molecular biology are first clustered using a graph-theoretic method, and the result is then refined by combining it with other algorithms, such as the k-nearest neighbor self-learning clustering algorithm or the medoid-based self-learning clustering algorithm. For a given clustering criterion, the method can produce globally optimal clusters. Finally, the algorithm is analyzed and discussed, and it is verified experimentally on simulated data.

Key words: gene expression data, data clustering, cluster, undirected graph, connected components
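
The abstract does not say how the undirected graph is built, so the following is only a minimal sketch of the initial clustering step under stated assumptions: each expression profile is a point, two points are joined by an edge when their Euclidean distance is at most a threshold `eps`, and each connected component of that graph is taken as one initial cluster. The function name `connected_component_clusters`, the parameter `eps`, and the sample data are hypothetical and are not taken from the paper. The refinement step (k-nearest neighbor or medoid-based) is not shown.

```python
from itertools import combinations
import math


def connected_component_clusters(points, eps):
    """Return clusters (lists of point indices) that are the connected
    components of the graph linking points at distance <= eps.

    Assumption: the eps-threshold graph is an illustrative choice,
    not necessarily the construction used in the paper.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as the root so results are deterministic.
            parent[max(ri, rj)] = min(ri, rj)

    # Add an edge (merge components) for every sufficiently close pair.
    for i, j in combinations(range(n), 2):
        if math.dist(points[i], points[j]) <= eps:
            union(i, j)

    # Collect the members of each component.
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical two-dimensional "expression profiles".
profiles = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0), (10.0, 0.0)]
clusters = connected_component_clusters(profiles, eps=0.6)
print(clusters)
```

Points 0 and 2 are farther apart than `eps` but still end up in the same cluster because both are linked to point 1; this chaining is what distinguishes connected-component clustering from a plain distance cutoff. The pairwise loop is quadratic in the number of points, which is acceptable for a sketch but would need a spatial index for large data sets.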

周海岩,严云洋. 利用连通分支对基因表示数据的聚类算法[J]. 计算机工程与应用, 2008, 44(9): 152-155.

ZHOU Hai-yan, YAN Yun-yang. Algorithm for clustering gene expression data using connected components[J]. Computer Engineering and Applications, 2008, 44(9): 152-155.

