Chinese Named Entity Recognition Based on Gated Multi-Feature Extractors

doi:10.3778/j.issn.1002-8331.2009-0363

Abstract

Abstract: Without introducing auxiliary features, and focusing only on the text itself, this paper builds multiple feature extractors to mine abstract, deep, high-dimensional features of the text sequence. The BERT pre-trained model is used to obtain word embeddings that carry richer information. The word embeddings are fed into BiLSTM and IDCNN separately for a first round of feature extraction. To obtain higher-dimensional features and to enable multi-channel information transmission and flow control, a gating mechanism is introduced into the IDCNN network. To improve the efficiency of feature extraction, a multi-head self-attention mechanism is added. A shared BiLSTM is constructed to let feature information flow interactively, which strengthens the feature representation. Two CRF models are created to enrich the feature distribution and to pass feature information across layers, improving the accuracy of tag-sequence prediction. In tests on two datasets and in comparison with four NER models, the F1 value improves to a certain extent.

Key words: feature extraction, word embedding, gating mechanism, shared BiLSTM, multi-head self-attention
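
The abstract names a gating mechanism inside the IDCNN but gives no formula for it. The following is a minimal sketch of one plausible reading, not the authors' implementation: each dilated convolution layer produces a feature path and a gate path, and the output is the feature path scaled element-wise by the sigmoid of the gate path (a GLU-style gate). The kernel width of 3, the dilation schedule (1, 1, 2), the output width, and every function name here are assumptions made for illustration; random weights stand in for learned parameters, and plain Python lists stand in for tensors.

```python
import math
import random


def dilated_conv1d(seq, weights, bias, dilation):
    """Width-3 dilated convolution over a list of feature vectors.

    weights[k][o][i] is the weight for tap k, output channel o, input channel i.
    Positions that fall outside the sequence are treated as zero padding.
    """
    n, d_in = len(seq), len(seq[0])
    d_out = len(bias)
    out = []
    for t in range(n):
        y = list(bias)
        for k, offset in enumerate((-dilation, 0, dilation)):
            pos = t + offset
            if 0 <= pos < n:
                x = seq[pos]
                for o in range(d_out):
                    y[o] += sum(weights[k][o][i] * x[i] for i in range(d_in))
        out.append(y)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_dilated_block(seq, feat_params, gate_params, dilation):
    """Assumed GLU-style gate: feature path times sigmoid(gate path)."""
    feat = dilated_conv1d(seq, *feat_params, dilation)
    gate = dilated_conv1d(seq, *gate_params, dilation)
    return [[f * sigmoid(g) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feat, gate)]


def random_params(rng, d_in, d_out):
    """Random (weights, bias) pair standing in for trained parameters."""
    weights = [[[rng.uniform(-0.5, 0.5) for _ in range(d_in)]
                for _ in range(d_out)] for _ in range(3)]
    return weights, [0.0] * d_out


def iterated_gated_block(seq, dilations=(1, 1, 2), dim=4, seed=0):
    """Stack gated dilated layers; the dilation schedule is an assumption."""
    rng = random.Random(seed)
    hidden = seq
    for d in dilations:
        feat_params = random_params(rng, len(hidden[0]), dim)
        gate_params = random_params(rng, len(hidden[0]), dim)
        hidden = gated_dilated_block(hidden, feat_params, gate_params, d)
    return hidden


# Toy stand-in for six token embeddings of width 3.
embeddings = [[0.1 * t, 0.2, -0.1 * t] for t in range(6)]
features = iterated_gated_block(embeddings)
print(len(features), len(features[0]))
```

The gate value lies between 0 and 1, so each channel of the feature path is passed through or attenuated per position, which is one way to read the "flow control" the abstract describes. The sequence length is preserved, so the output could be combined position by position with the output of a parallel BiLSTM.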

YANG Rongying, HE Qing, DU Nisuo. Chinese Named Entity Recognition Based on Gated Multi-Feature Extractors[J]. Computer Engineering and Applications, 2022, 58(8): 117-124.

