Survey for Uyghur Morphological Analysis

doi:10.3778/j.issn.1002-8331.2103-0278

Computer Engineering and Applications, 2021, Vol. 57, Issue 15: 42-61.

LIU Chang, Abudukelimu·Abulizi, YAO Dengfeng, Halidanmu·Abudukelimu

1. Department of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China
2. Institute of Silk Road Economy and Management, Xinjiang University of Finance and Economics, Urumqi 830012, China
3. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China

Online: 2021-08-01 Published: 2021-07-26

Abstract

Uyghur is a morphologically rich, agglutinative language that suffers from data sparsity. Its processing technology lags well behind that of widely studied languages such as English and Chinese and does not yet meet the development needs of Xinjiang. Morphological analysis is an important part of natural language processing, and research on Uyghur morphological analysis is significant for advancing Uyghur language information processing technology. This paper briefly introduces Uyghur grammar and describes the state of research on Uyghur natural language processing, morphological analysis, and the related basic resources. It divides common methods into five categories (rule-based, dictionary-based, statistics-based, deep learning-based, and hybrid) and analyzes the advantages and disadvantages of each. It then introduces follow-up research on Uyghur morphological analysis, draws on advanced lexical analysis methods, summarizes the challenges and opportunities facing Uyghur morphological analysis, and looks ahead to its future development trends.

Key words: Uyghur, natural language processing, morphological analysis, phonetic restoration, stemming, morphological segmentation

LIU Chang, Abudukelimu·Abulizi, YAO Dengfeng, Halidanmu·Abudukelimu. Survey for Uyghur Morphological Analysis[J]. Computer Engineering and Applications, 2021, 57(15): 42-61.
