Computer Engineering and Applications, 2022, Vol. 58, Issue (7): 43-54. DOI: 10.3778/j.issn.1002-8331.2108-0200

• Research Hotspots and Reviews •

Review of Application of Data Augmentation Strategy in English Grammar Error Correction

SUN Xiaodong,  YANG Dongqiang   

  1. School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China
  • Online: 2022-04-01 Published: 2022-04-01

数据增广策略在英语语法纠错中的应用综述

孙晓东,杨东强   

  1. 山东建筑大学 计算机科学与技术学院,济南 250101

Abstract: In recent years, treating grammatical error correction as a machine translation task has led to significant progress in English grammatical error correction. For data-driven natural language processing methods, large-scale, high-quality annotated corpora have become the most important resource for translation and related tasks. This survey focuses on the datasets and data augmentation methods used in English grammatical error correction. It comprehensively summarizes and analyzes the datasets, data synthesis methods, evaluation methods, and current applications in the field. Finally, it summarizes and discusses prospects for improving the performance of English grammatical error correction models.

Key words: data-driven, data augmentation, English grammar error correction

摘要: 近年来,将语法错误纠正当作机器翻译任务在英语语法纠错领域取得重大进展,对于数据驱动的自然语言处理方法,大规模、高质量的标注语料成为翻译等相关任务最重要的资源。在调查中,主要关注英语语法纠错领域的数据集和数据增广方法。全面地概括了英语语法纠错领域使用的数据集、数据合成、评价方法及应用现状,并对其进行归纳分析;对今后如何提高英语语法纠错模型的性能进行了总结和展望。

关键词: 数据驱动, 数据增广, 英语语法纠错