Recent Advances and Challenges on Grammatical Error Correction in Natural Language

ZHANG Ming, LU Qinghua, HUANG Yuanzhong, LI Ruixuan   

  1. School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, China
  2. Department of Muyu, Shenzhen Fangzhi Science & Technology Co., Ltd., Shenzhen, Guangdong 518000, China

Abstract: Grammatical error correction (GEC) is an important application in natural language processing and has seen substantial progress and a wealth of research results in recent years. This paper surveys GEC research in depth to better understand the current state of the art, the challenges the field faces, and its future directions. It introduces the basic concept of GEC and gives an overview of the field, then analyzes key research advances in three areas: data processing methods, algorithmic models, and evaluation methods. It also outlines the state of research on Chinese GEC. Finally, the paper summarizes resources related to GEC, including the literature, open-source implementations, and public corpora, and discusses the issues and challenges that remain.

Key words: grammatical error correction, machine translation, transfer learning, multi-lingual model
