Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (18): 186-193.DOI: 10.3778/j.issn.1002-8331.2011-0147


Study on Text Classification Method of BERT-TECNN Model

LI Tiefei, SHENG Long, WU Di   

  1. College of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056107, China
  2. Hebei Key Laboratory of Security & Protection Information Sensing and Processing, Hebei University of Engineering, Handan, Hebei 056107, China

  Online: 2021-09-15    Published: 2021-09-13





The BERT-base, Chinese pre-trained model has a very large number of parameters, and during fine-tuning for a classification task its internal parameters change little, so it is prone to overfitting and generalizes weakly. Moreover, the model is pre-trained at the character level and therefore carries little word-level information. To address these problems, this study proposes the BERT-TECNN model. The model uses BERT-base, Chinese as a dynamic character-vector model that outputs character vectors carrying deep feature information. A Transformer encoder layer then performs multi-head self-attention over the data again to extract further feature information and improve the model's generalization ability. A CNN layer with convolution kernels of different sizes captures word information of different lengths in each piece of data, and finally softmax performs the classification. Compared with Word2Vec+CNN, Word2Vec+BiLSTM, Elmo+CNN, BERT+CNN, BERT+BiLSTM, BERT+Transformer and other deep-learning text-classification models on three datasets, the proposed model achieves the highest accuracy, precision, recall and F1 measure. Experiments show that the model effectively extracts the feature information of characters and words in text, mitigates overfitting, and improves generalization ability.
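As a rough illustration of the CNN-and-softmax stage the abstract describes (not the authors' code, and with BERT character vectors replaced by random placeholders), the sketch below slides convolution kernels of several widths over a sequence of character vectors, max-pools each kernel's activations to capture word-like spans of different lengths, and classifies with softmax:

```python
# Minimal sketch, assuming: random vectors stand in for BERT character
# embeddings, kernel widths 2/3/4 stand in for the paper's kernel sizes,
# and a single linear layer feeds softmax.
import numpy as np

def conv1d_maxpool(x, kernel):
    """x: (seq_len, dim) character vectors; kernel: (width, dim).
    Slide the kernel over all positions and keep the max activation."""
    width = kernel.shape[0]
    acts = [np.sum(x[i:i + width] * kernel)
            for i in range(x.shape[0] - width + 1)]
    return max(acts)

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
seq_len, dim, n_classes = 16, 8, 3
x = rng.standard_normal((seq_len, dim))  # placeholder for BERT character vectors

# kernels of different widths capture word information of different lengths
features = np.array([conv1d_maxpool(x, rng.standard_normal((w, dim)))
                     for w in (2, 3, 4)])
W = rng.standard_normal((n_classes, features.size))  # hypothetical classifier weights
probs = softmax(W @ features)  # class probabilities, one per category
```

In the actual model these steps would be learned layers (e.g. trained convolution filters and classifier weights) applied after the Transformer encoder, rather than random matrices.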

Key words: BERT, Transformer, encoder, CNN, text classification, fine-tuning, self-attention, overfitting

