Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics

Computer Engineering and Applications, 2019, Vol. 55, Issue (23): 163-169. DOI: 10.3778/j.issn.1002-8331.1809-0240

ZHANG Jinpeng, SU Jiao, YANG Bei, ZHANG Zhan

1. Center of Information Management, Yunnan University of Finance and Economics, Kunming 650221, China
2. School of International Languages and Cultures, Yunnan University of Finance and Economics, Kunming 650221, China
3. School of Information Engineering, Wuchang University of Technology, Wuhan 430223, China
4. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China

Online: 2019-12-01 Published: 2019-12-11

Abstract

Abstract: Bilingual name alignment directly affects the quality of cross-language information processing. Chinese and Thai differ greatly in pronunciation, and Chinese-Thai parallel corpus resources are limited, so statistical transliteration-based name alignment models struggle with Chinese-Thai name alignment. This paper proposes a Chinese-Thai name alignment method that combines transliteration features with the similarity of name knowledge distribution features. First, the transliteration similarity feature of a bilingual name pair is computed. Next, the knowledge distribution similarity feature between the Chinese name and the Thai name is computed using the chi-square test and other measures. A support vector machine then learns both features of Chinese-Thai name translation pairs to produce a name translation pair classifier, and the alignment results are generated by tuning the classifier's output. Experimental results show that the method performs well even when the pronunciations of the names differ greatly and bilingual corpus resources are scarce.

Key words: Chinese, Thai, bilingual name alignment, name knowledge distribution, adjusting and optimizing classification results
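
The two features named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how either feature is computed, so the sketch assumes that transliteration similarity is a normalized edit distance over romanized name strings, and that knowledge distribution similarity is derived from a Pearson chi-square statistic over counts of language-independent context categories. All function names, the 1 / (1 + chi-square) mapping, and the sample data are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def transliteration_similarity(roman_a, roman_b):
    """Assumed stand-in feature: 1 minus the length-normalized edit distance."""
    longest = max(len(roman_a), len(roman_b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(roman_a, roman_b) / longest


def chi_square_statistic(counts_a, counts_b):
    """Pearson chi-square statistic for the 2 x k table formed by two count dicts."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each name needs at least one observed context")
    grand_total = total_a + total_b
    statistic = 0.0
    for category in set(counts_a) | set(counts_b):
        observed_a = counts_a.get(category, 0)
        observed_b = counts_b.get(category, 0)
        column_total = observed_a + observed_b
        if column_total == 0:
            continue  # a category never observed contributes nothing
        for observed, row_total in ((observed_a, total_a), (observed_b, total_b)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def distribution_similarity(counts_a, counts_b):
    """Assumed mapping of the statistic into (0, 1]; identical profiles give 1.0."""
    return 1.0 / (1.0 + chi_square_statistic(counts_a, counts_b))


def pair_features(roman_zh, roman_th, contexts_zh, contexts_th):
    """Two-dimensional feature vector for one candidate name pair."""
    return (transliteration_similarity(roman_zh, roman_th),
            distribution_similarity(contexts_zh, contexts_th))


if __name__ == "__main__":
    # Made-up romanizations and context counts, for illustration only.
    features = pair_features(
        "tanakon", "thanakorn",
        {"politics": 12, "sports": 1, "business": 3},
        {"politics": 9, "sports": 2, "business": 2},
    )
    print(features)
```

In a pipeline like the one the abstract describes, feature vectors of this kind, computed for labeled translation pairs and non-pairs, would be the training input to a support vector machine, whose predictions would then be tuned to produce the final alignment.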

ZHANG Jinpeng, SU Jiao, YANG Bei, ZHANG Zhan. Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics[J]. Computer Engineering and Applications, 2019, 55(23): 163-169.
