Research on Kirgiz language part of speech tagging based on HMM

Abstract

Research on Kirghiz language information processing plays a vital role in determining whether the Kirghiz people of Xinjiang can enter the information age and pass on their national culture. This paper adopts a two-level tagging method and, building on traditional HMM theory, improves the calculation of the HMM parameters, the data smoothing, and the handling of out-of-vocabulary (unknown) words, so that the model better reflects context dependence. In addition, a stem extraction algorithm based on an automatic word segmentation dictionary is combined with rule-based and statistical methods and applied to a Kirghiz part-of-speech tagging system. Compared with the traditional HMM, the improved method effectively raises tagging accuracy.

Keywords: Kirghiz, automatic word segmentation dictionary, Hidden Markov Model (HMM), part-of-speech tagging
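
For readers unfamiliar with the baseline, the following is a minimal sketch of a traditional first-order HMM tagger, not the improved method of this paper. It assumes add-one (Laplace) smoothing for both transition and emission probabilities, treats any out-of-vocabulary word as one extra smoothed vocabulary slot, and decodes with the Viterbi algorithm. The function names (`train_hmm`, `log_prob`, `viterbi`) and the tiny English toy corpus are hypothetical and chosen only for illustration; no Kirghiz data or stem extraction is involved.

```python
import math
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from tagged sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted words
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo tag
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(emissions)
    return transitions, emissions, tags, vocab

def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability of `key` under `counter`."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))

def viterbi(words, model):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    transitions, emissions, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot shared by all unseen words
    # best[i][t] = (log score of the best path ending in tag t, previous tag)
    best = [{}]
    for t in tags:
        score = (log_prob(transitions["<s>"], t, n_tags)
                 + log_prob(emissions[t], words[0], n_words))
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            emit = log_prob(emissions[t], words[i], n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(transitions[p], t, n_tags) + emit, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy training data (illustrative only).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_hmm(corpus)
print(viterbi(["the", "cat", "runs"], model))     # ['DET', 'NOUN', 'VERB']
print(viterbi(["the", "bird", "sleeps"], model))  # "bird" is unseen in training
```

In the second call the unseen word receives the same smoothed emission probability under every tag, so its tag is decided entirely by the transition probabilities. This is the kind of context dependence that the abstract says the improved parameter estimation and unknown-word handling are meant to capture better.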

CHEN Li, Gulila·ALTENBEK. Research on Kirgiz language part of speech tagging based on HMM[J]. Computer Engineering and Applications, 2014, 50(15): 120-124.

