Computer Engineering and Applications (计算机工程与应用) ›› 2014, Vol. 50 ›› Issue (1): 145-148.

• Graphics and Image Processing •

Research on HMM-Based Recognition of Online Handwritten Kazakh Script

Dawel Abilhayer1,2, Gulila Altenbek1,2

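  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046
  2. Key Laboratory of Multilingual Information Technology of Xinjiang, Urumqi 830046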
  • Publication date: 2014-01-01 Release date: 2013-12-30

Study of HMM-based online Kazakh handwriting recognition

Dawel Abilhayer1,2, Gulila Altenbek1,2

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  2. Key Laboratory of Multilingual Information Technology of Xinjiang, Urumqi 830046, China
  • Online: 2014-01-01 Published: 2013-12-30

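Abstract: Building on research into hidden Markov models (HMMs) and statistical language models, this paper focuses on segmentation, word-part classification, and a dedicated feature-parameter extraction technique for online handwritten Kazakh. The system first separates the delayed strokes from each word-part and feeds the remaining main stroke to the HMM recognizer. It then searches the word-part classification dictionary using the index of the recognized main stroke together with the delayed-stroke markers to find the corresponding word-part recognition result. Removing the delayed strokes from word-parts effectively reduces the number of models that must be built, which in turn improves recognition speed and avoids the problems caused by character segmentation.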

Keywords: Kazakh, online handwriting, hidden Markov model, word-part, word-part classification

Abstract: Based on the Hidden Markov Model (HMM) and the Statistical Language Model (SLM), this paper focuses on segmentation, word-part classification, and feature extraction for online Kazakh handwriting recognition. The delayed strokes are first removed from each word-part, and the remaining main stroke is used as the input to the HMM recognizer. The recognition result is then looked up in the word-part classification dictionary according to the index of the recognized main stroke and the delayed-stroke markers. Removing the delayed strokes reduces the number of models that must be built, which improves the recognition speed of the system and avoids some problems caused by character segmentation.
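
The two-stage lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the delayed-stroke test, the main-stroke recognizer (a stub standing in for the HMM), the marker format, and the dictionary entries are all hypothetical placeholders chosen only to show how a main-stroke index and a delayed-stroke marker could jointly select a word-part.

```python
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y) pen sample; y is assumed to grow downward
Stroke = List[Point]          # one pen-down to pen-up trace
Marker = Tuple[str, ...]      # hypothetical delayed-stroke marker, e.g. ("above",)


def split_strokes(strokes: List[Stroke],
                  is_delayed: Callable[[Stroke], bool]):
    """Separate the main strokes of a word-part from its delayed strokes."""
    main = [s for s in strokes if not is_delayed(s)]
    delayed = [s for s in strokes if is_delayed(s)]
    return main, delayed


def recognize_word_part(strokes: List[Stroke],
                        is_delayed: Callable[[Stroke], bool],
                        recognize_main: Callable[[List[Stroke]], int],
                        mark_delayed: Callable[[List[Stroke], List[Stroke]], Marker],
                        dictionary: Dict[Tuple[int, Marker], str]) -> Optional[str]:
    """Recognize the main stroke, then look up (main index, marker)."""
    main, delayed = split_strokes(strokes, is_delayed)
    main_index = recognize_main(main)      # in the paper this role is played by the HMM
    marker = mark_delayed(delayed, main)   # summary of the removed delayed strokes
    return dictionary.get((main_index, marker))


# Toy stand-ins (assumptions for illustration only).
def toy_is_delayed(stroke: Stroke) -> bool:
    return len(stroke) <= 2                # treat very short traces as dots


def toy_recognize_main(main: List[Stroke]) -> int:
    return 7                               # stub: pretend the recognizer chose model 7


def toy_mark_delayed(delayed: List[Stroke], main: List[Stroke]) -> Marker:
    ys = [y for s in main for _, y in s]
    if not ys:
        return ()
    mid = sum(ys) / len(ys)
    return tuple(sorted("above" if s[0][1] < mid else "below" for s in delayed))


TOY_DICTIONARY: Dict[Tuple[int, Marker], str] = {
    (7, ()): "shape-7",
    (7, ("above",)): "shape-7+dot-above",
    (7, ("below",)): "shape-7+dot-below",
}

body = [(0.0, 10.0), (5.0, 12.0), (10.0, 10.0), (15.0, 11.0)]
dot_above = [(5.0, 2.0)]
dot_below = [(5.0, 20.0)]

result_above = recognize_word_part([body, dot_above], toy_is_delayed,
                                   toy_recognize_main, toy_mark_delayed,
                                   TOY_DICTIONARY)
result_below = recognize_word_part([body, dot_below], toy_is_delayed,
                                   toy_recognize_main, toy_mark_delayed,
                                   TOY_DICTIONARY)
```

In this toy setup one main-stroke model serves three dictionary entries, which illustrates why removing delayed strokes can reduce the number of models that need to be trained.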

Key words: Kazakh, online handwriting, Hidden Markov Model (HMM), word-part, word-part classification