[1] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[J]. arXiv:1409.0473, 2014.
[2] 章钧津, 田永红, 宋哲煜, 等. 神经机器翻译综述[J]. 计算机工程与应用, 2024, 60(4): 57-74.
ZHANG J J, TIAN Y H, SONG Z Y, et al. Survey of neural machine translation[J]. Computer Engineering and Applications, 2024, 60(4): 57-74.
[3] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 5998-6008.
[4] HADDOW B, BAWDEN R, BARONE A, et al. Survey of low-resource machine translation[J]. Computational Linguistics, 2022, 48(3): 673-732.
[5] 李洪政, 冯冲, 黄河燕. 稀缺资源语言神经网络机器翻译研究综述[J]. 自动化学报, 2021, 47(6): 1217-1231.
LI H Z, FENG C, HUANG H Y. A survey on low-resource neural machine translation[J]. Acta Automatica Sinica, 2021, 47(6): 1217-1231.
[6] MAIMAITI M, LIU Y, LUAN H, et al. Data augmentation for low-resource languages NMT guided by constrained sampling[J]. International Journal of Intelligent Systems, 2022, 37(1): 30-51.
[7] DING L, PENG K, TAO D. Improving neural machine translation by denoising training[J]. arXiv:2201.07365, 2022.
[8] XIA Y, HE D, QIN T, et al. Dual learning for machine translation[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016: 820-828.
[9] XU H, VAN DURME B, MURRAY K. BERT, mBERT, or BiBERT? a study on contextualized embeddings for neural machine translation[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2021: 6663-6675.
[10] LI P, LI L, ZHANG M, et al. Universal conditional masked language pre-training for neural machine translation[J]. arXiv:2203.09210, 2022.
[11] ZHANG M, XU J. Byte-based multilingual NMT for endangered languages[C]//Proceedings of the 29th International Conference on Computational Linguistics, 2022: 4407-4417.
[12] GU J, WANG Y, CHEN Y, et al. Meta-learning for low-resource neural machine translation[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2018: 3622-3631.
[13] LAI W, CHRONOPOULOU A, FRASER A. m4Adapter: multilingual multi-domain adaptation for machine translation with a meta-adapter[J]. arXiv:2210.11912, 2022.
[14] ZOPH B, YURET D, MAY J, et al. Transfer learning for low-resource neural machine translation[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2016: 1568-1575.
[15] RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners[J]. OpenAI Blog, 2019, 1(8): 9.
[16] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019: 4171-4186.
[17] LIU Y, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[J]. arXiv:1907.11692, 2019.
[18] LAMPLE G, CONNEAU A. Cross-lingual language model pretraining[J]. arXiv:1901.07291, 2019.
[19] YANG Z, DAI Z, YANG Y, et al. XLNet: generalized autoregressive pretraining for language understanding[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019: 5753-5763.
[20] GOODFELLOW I, MIRZA M, XIAO D, et al. An empirical investigation of catastrophic forgetting in gradient-based neural networks[J]. arXiv:1312.6211, 2013.
[21] ZHU J, XIA Y, WU L, et al. Incorporating BERT into neural machine translation[J]. arXiv:2002.06823, 2020.
[22] ZHANG Z, WU S, JIANG D, et al. BERT-JAM: boosting BERT-enhanced neural machine translation with joint attention[J]. arXiv:2011.04266, 2020.
[23] HWANG S J, JEONG C S. Integrating pre-trained language model into neural machine translation[J]. arXiv:2310.19680, 2023.
[24] HU E J, SHEN Y, WALLIS P, et al. LoRA: low-rank adaptation of large language models[J]. arXiv:2106.09685, 2021.
[25] IMAMURA K, SUMITA E. Recycling a pre-trained BERT encoder for neural machine translation[C]//Proceedings of the 3rd Workshop on Neural Generation and Translation, 2019: 23-31.
[26] GUO J, ZHANG Z, XU L, et al. Incorporating BERT into parallel sequence decoding with adapters[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020: 10843-10854.
[27] SUN Z, WANG M, LI L. Multilingual translation via grafting pre-trained language models[C]//Findings of the Association for Computational Linguistics: EMNLP 2021, 2021: 2735-2747.
[28] MA S, DONG L, HUANG S, et al. DeltaLM: encoder-decoder pre-training for language generation and translation by augmenting pretrained multilingual encoders[J]. arXiv:2106.13736, 2021.
[29] WENG R, YU H, LUO W, et al. Deep fusing pre-trained models into neural machine translation[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022: 11468-11476.
[30] DUAN S, ZHAO H. Encoder and decoder, not one less for pre-trained language model sponsored NMT[C]//Findings of the Association for Computational Linguistics: ACL 2023, 2023: 3602-3613.
[31] DAI Y, SHAROFF S, KAMPS M. Syntactic knowledge via graph attention with BERT in machine translation[J]. arXiv:2305.13413, 2023.
[32] 占思琦, 徐志展, 杨威, 等. 基于深度编码注意力的XLNet-Transformer汉-马低资源神经机器翻译优化方法[J/OL]. 计算机应用研究, 2024, 41(3): 799-804.
ZHAN S Q, XU Z Z, YANG W, et al. XLNet-Transformer optimization method for Chinese-Malay low-resource neural machine translation based on deep coded attention[J]. Application Research of Computers, 2024, 41(3): 799-804.
[33] YANG J, WANG M, ZHOU H, et al. Towards making the most of BERT in neural machine translation[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020: 9378-9385.
[34] CHEN Y C, GAN Z, CHENG Y, et al. Distilling knowledge learned in BERT for text generation[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020: 7893-7905.
[35] WU X, XIA Y, ZHU J, et al. A study of BERT for context-aware neural machine translation[J]. Machine Learning, 2022, 111(3): 917-935.
[36] 朱志国, 郭军军, 余正涛. 一种Mask交互融合预训练知识的低资源神经机器翻译方法[J]. 小型微型计算机系统, 2024, 45(3): 591-597.
ZHU Z G, GUO J J, YU Z T. Low-resource neural machine translation method based on mask interactive fusion of pre-trained knowledge[J]. Journal of Chinese Computer Systems, 2024, 45(3): 591-597.
[37] 张迎晨, 高盛祥, 余正涛, 等. 融合BERT与词嵌入双重表征的汉越神经机器翻译方法[J]. 计算机工程与科学, 2023, 45(3): 546-553.
ZHANG Y C, GAO S X, YU Z T, et al. A Chinese-Vietnamese neural machine translation method using the dual representation of BERT and word embedding[J]. Computer Engineering and Science, 2023, 45(3): 546-553.
[38] EDUNOV S, BAEVSKI A, AULI M. Pre-trained language model representations for language generation[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019: 4052-4059.
[39] SENNRICH R, HADDOW B, BIRCH A. Neural machine translation of rare words with subword units[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016: 1715-1725.
[40] MIN Z. Attention link: an efficient attention-based low resource machine translation architecture[J]. Procedia Computer Science, 2023, 222: 284-292.
[41] LI J, MENG F, LIN Z, et al. Neutral utterances are also causes: enhancing conversational causal emotion entailment with social commonsense knowledge[C]//Proceedings of the 31st International Joint Conference on Artificial Intelligence, 2022: 4184-4190.
[42] WU Z, WU L, MENG Q, et al. UniDrop: a simple yet effective technique to improve transformer without extra cost[C]//Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021: 3865-3878.