Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (3): 162-167. DOI: 10.3778/j.issn.1002-8331.1910-0447

TextCNN Based Filtering Model for Barrage in Live Video Broadcast

MING Jianhua, HU Chuang, ZHOU Jianzheng, YAO Jinliang   

  1. Project Department, Tian Ge Interactive Holdings Limited, Hangzhou 310105, China
  2. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
  • Online: 2021-02-01 Published: 2021-01-29

针对直播弹幕的TextCNN过滤模型

明建华,胡创,周建政,姚金良   

  1. 天鸽互动控股有限公司 项目部,杭州 310105
  2. 杭州电子科技大学 计算机学院,杭州 310018

Abstract:

The rise of webcasting has made live barrage (real-time on-screen comments) a new way of communication, but it has also brought various types of illegal barrage. For identifying illegal barrage, manual screening is too inefficient, while traditional keyword filtering and statistical machine learning methods have low recognition rates and cannot handle mutated short texts. Enabling machines to identify illegal barrage more efficiently and accurately, and thereby create a better network environment, is therefore a meaningful problem. This paper proposes a method based on text convolutional neural networks (TextCNN) for recognizing noisy illegal short text. By preprocessing the noisy short texts and using TextCNN to mine correlation features between characters, the method greatly improves the recognition rate of illegal short text in live barrage.

Key words: live barrage, text with noise, text filtering, convolutional neural network

摘要:

网络直播的兴起,促使直播弹幕成为一种新型的交流方式。随之而来的还有各类非法弹幕。在识别非法弹幕方面,人工筛选过于低效,传统关键词过滤方法和统计机器学习方法识别率较低,且无法应对变异短文本。如何让机器更高效、更准确地识别非法弹幕以营造更好的网络环境是一个很有意义的问题。提出了基于文本卷积神经网络(TextCNN)的带噪非法短文本识别方法。通过对带噪短文本的预处理以及利用文本卷积神经网络挖掘字符间的相关特征,极大地提高了直播弹幕中非法短文本的识别率。

关键词: 直播弹幕, 带噪短文本, 文本过滤, 卷积神经网络
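
The two stages named in the abstract, preprocessing noisy short text and extracting features with a text convolution, can be illustrated with a minimal sketch. The abstract does not specify the preprocessing rules or the network configuration, so everything here is an assumption made for illustration: the function names `normalize` and `conv_max_pool`, the noise pattern, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import re
import unicodedata

# Hypothetical noise pattern: anything that is not an ASCII digit,
# a lowercase ASCII letter, or a common CJK ideograph.
_NOISE = re.compile(r"[^0-9a-z\u4e00-\u9fff]+")


def normalize(text: str) -> str:
    """Fold full-width forms and case, then drop separator noise."""
    folded = unicodedata.normalize("NFKC", text).lower()
    return _NOISE.sub("", folded)


def conv_max_pool(embedded, kernel, bias=0.0):
    """Apply one TextCNN-style filter: slide it over the character
    positions, apply ReLU, and keep the maximum over time.

    embedded: list of per-character vectors (lists of floats)
    kernel:   list of weight vectors, one per position in the window
    """
    width = len(kernel)
    best = 0.0  # ReLU output is never below 0
    for start in range(len(embedded) - width + 1):
        score = bias
        for offset in range(width):
            vec = embedded[start + offset]
            weights = kernel[offset]
            score += sum(v * w for v, w in zip(vec, weights))
        best = max(best, score)
    return best
```

In a full TextCNN, many such filters of several widths would run in parallel over learned character embeddings, and their pooled outputs would feed a classifier; this sketch shows a single filter only.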