Gene name normalization based on extended semantic similarity

Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (35): 128-131.

• Database, Signal and Information Processing •


HU Yuncui, LIN Hongfei, YANG Zhihao

School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-12-11 Published: 2011-12-11

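Chinese title: A gene name normalization method based on semantic similarity

HU Yuncui, LIN Hongfei, YANG Zhihao

Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China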

Abstract

Gene symbol descriptions in biomedical databases are often too sparse and incomplete to distinguish among the candidate symbols for an ambiguous term. This paper presents a normalization method based on extended semantic similarity to address this problem. For each gene symbol, extended semantic information is extracted from Gene Ontology and MEDLINE abstracts. The unique identifier that expresses the actual meaning of a named entity is then chosen according to the similarity between the entity's context and the extended semantic descriptions of the candidates. On the BioCreative II gene normalization task, the method achieves an F-measure of 81.2% (precision 80%, recall 82.4%). The results indicate that extended semantic similarity is applicable to gene named entity normalization.

Key words: gene, normalization, extended semantic similarity, disambiguation
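The disambiguation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or text representation the authors use, so this example assumes a plain bag-of-words cosine similarity. The identifiers `GENE:A` and `GENE:B`, their descriptions, and the function names are hypothetical and only stand in for extended descriptions built from Gene Ontology and MEDLINE text.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_mention(context, candidates):
    """Pick the candidate identifier whose description is most similar to the context.

    Assumption: similarity is bag-of-words cosine; the paper may use a different measure.
    """
    context_vec = bag_of_words(context)
    return max(
        candidates,
        key=lambda gene_id: cosine(context_vec, bag_of_words(candidates[gene_id])),
    )


# Hypothetical identifiers and extended descriptions, for illustration only.
candidates = {
    "GENE:A": "tumor suppressor protein cell cycle arrest apoptosis dna damage",
    "GENE:B": "membrane transporter glucose uptake insulin signaling",
}
context = "The protein induces apoptosis after DNA damage and halts the cell cycle."

best_id = normalize_mention(context, candidates)
print(best_id)
```

In this toy case the context shares several tokens with the first description and none with the second, so the first identifier is selected. A richer description per identifier gives the comparison more tokens to match, which is the motivation the abstract gives for extending the database descriptions.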

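Abstract (translated from the Chinese version): Descriptions of gene identifiers in biomedical databases are not rich or complete enough to distinguish the different senses of an ambiguous term. To address this, a gene name normalization method based on extended semantic similarity is presented. The method uses MEDLINE abstracts and Gene Ontology descriptions to generate extended semantic information for the gene identifiers in the database. It then compares the context of an ambiguous gene name with each of its candidate semantic descriptions and assigns the name the unique gene identifier that expresses its actual meaning. On the corpus of the BioCreative II gene normalization task, the experiments reach a precision of 80%, a recall of 82.4%, and an F-measure of 81.2%. The results show that the extended semantic similarity method is suitable for named entity normalization in the biomedical domain.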
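As a quick consistency check on the reported scores, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the balanced F-measure (F1, the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f_measure(0.80, 0.824)
print(f"{f1:.1%}")  # 81.2%
```

Under that assumption, the recomputed value agrees with the reported 81.2%.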

HU Yuncui, LIN Hongfei, YANG Zhihao. Gene name normalization based on extended semantic similarity[J]. Computer Engineering and Applications, 2011, 47(35): 128-131.


