Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (1): 263-270. DOI: 10.3778/j.issn.1002-8331.2208-0247

• Graphics and Image Processing •

Arbitrary Shape Text Detection Based on High-Order Graph Convolution Reasoning Network

LIU Ping (刘平), JIANG Yongfeng (姜永峰), ZHANG Liang (张良)

  1. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
  2. Information Detachment, Wenzhou Public Security Bureau, Zhejiang Province, Wenzhou, Zhejiang 325000, China
  • Online: 2024-01-01 Published: 2024-01-01

Abstract: General scene text detection is widely used in fields such as map navigation and autonomous driving. Scene text varies in orientation and takes complex, changeable shapes, which makes it difficult to detect. To address this problem, a high-order graph convolution reasoning network is proposed. First, building on the text detection framework DRRG, a high-order graph scheme is designed and a high-order graph convolution reasoning network is proposed, which expands the reasoning range and effectively combines the auxiliary information provided by high-order neighbors. Second, the setting of first-order neighbors is improved to reduce interference from irrelevant components, which improves the efficiency of back-propagation and component linking. Finally, an SE-aggregation module is introduced to generate an aggregation scheme independently and adaptively for each node, which further improves the utilization of high-order information. Experimental results show that the F-measure (F1) of the improved network on the Total-Text, CTW-1500, and ICDAR2015 datasets increases by 1.4, 1.05, and 1.26 percentage points, respectively.

Key words: image processing, text detection, high-order graph convolutional network, relational reasoning network, SE-aggregation
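
The abstract describes the method only at a high level, so the following is a rough illustrative sketch rather than the authors' implementation. It assumes that "high-order" means propagating node features through successive powers of a normalized adjacency matrix, and that "SE aggregation" means a squeeze-and-excitation style gate that gives every node its own weight for each neighbor order. All names (`normalize_adjacency`, `high_order_gcn_layer`, `se_w1`, `se_w2`) and shapes are hypothetical choices made for this example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops, so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def high_order_gcn_layer(x, adj, weights, se_w1, se_w2):
    """Hypothetical high-order graph convolution with SE-style aggregation.

    x:       (N, F_in) node features
    adj:     (N, N) non-negative adjacency matrix (first-order neighbors)
    weights: list of K arrays of shape (F_in, F_out), one per neighbor order
    se_w1:   (K, H) squeeze-and-excitation bottleneck weights (assumed design)
    se_w2:   (H, K) squeeze-and-excitation expansion weights (assumed design)
    """
    a_norm = normalize_adjacency(adj)
    propagated = x
    per_order = []
    for w_k in weights:
        # After k iterations this equals (A_norm ** k) @ x, i.e. k-hop information.
        propagated = a_norm @ propagated
        per_order.append(np.maximum(propagated @ w_k, 0.0))   # ReLU
    stacked = np.stack(per_order, axis=1)                     # (N, K, F_out)

    # Squeeze: one scalar per node and order (mean over feature channels).
    squeezed = stacked.mean(axis=2)                           # (N, K)
    # Excitation: a small two-layer gate yields per-node, per-order weights.
    hidden = np.maximum(squeezed @ se_w1, 0.0)                # (N, H)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ se_w2)))           # (N, K), sigmoid

    # Aggregate: each node combines its K order-specific features adaptively.
    out = (stacked * gates[:, :, None]).sum(axis=1)           # (N, F_out)
    return out, gates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy chain graph of 5 text components: 0-1-2-3-4.
    adj = np.zeros((5, 5))
    for i in range(4):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    x = rng.normal(size=(5, 3))
    weights = [rng.normal(size=(3, 4)) for _ in range(2)]     # orders 1 and 2
    out, gates = high_order_gcn_layer(
        x, adj, weights, rng.normal(size=(2, 8)), rng.normal(size=(8, 2))
    )
    print(out.shape, gates.shape)
```

In this sketch the gate is computed separately for every node, which is one plausible reading of "independently and adaptively for each node"; the paper's actual module may squeeze, gate, or combine the orders differently.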