Computer Engineering and Applications (计算机工程与应用) ›› 2020, Vol. 56 ›› Issue (9): 213-220. DOI: 10.3778/j.issn.1002-8331.1911-0032

• Graphics and Image Processing •

Research on Inclined Text Location Method of Natural Scene Based on YOLO

ZHOU Xiangyu (周翔宇), GAO Zhonghe (高仲合)

  1. College of Software, Qufu Normal University, Qufu, Shandong 273100, China
  • Online: 2020-05-01 Published: 2020-04-29

Abstract:

To improve the accuracy of locating inclined text regions, this paper proposes YOLO_BOX, a localization model based on an improved YOLO algorithm. Anchors of different sizes are set for training on the images, and a loss function is defined to train the prediction model. The K-means algorithm is used to cluster the boxes, and non-maximum suppression (NMS) is applied to filter out redundant candidate boxes. The Angle Correct algorithm converts the clustered boxes to grayscale, computes the variance of the pixel gray values to obtain the tilt angle of the text, and then corrects that angle. Experimental results on the ICDAR2015 dataset show that the optimized YOLO_BOX model achieves high accuracy and recall when locating inclined text regions in natural scenes.
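The two post-processing steps named in the abstract, NMS filtering and variance-based tilt estimation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `nms` and `estimate_tilt`, the IoU threshold of 0.5, the candidate angle range of -45 to 45 degrees, and the use of projection-profile variance as the "variance of gray values" criterion are all assumptions made here for illustration. The abstract does not specify how Angle Correct computes the variance.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression on axis-aligned boxes (x1, y1, x2, y2).

    Assumes every box has positive area. Returns kept indices, best score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]          # highest score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        # Intersection of the current best box with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]       # drop boxes that overlap too much
    return keep

def estimate_tilt(gray, angles=np.arange(-45, 46)):
    """Estimate text tilt in degrees from a grayscale crop (dark text, light background).

    Assumed criterion: for each candidate angle, project the "ink" onto the axis
    perpendicular to a line tilted by that angle and take the variance of the
    resulting profile. The profile is sharpest (largest variance) when the
    candidate angle matches the text lines. Positive angles mean the text
    descends to the right in image coordinates (y pointing down).
    """
    gray = np.asarray(gray, dtype=float)
    ink = gray.max() - gray                   # invert so that text pixels are large
    ys, xs = np.nonzero(ink > 0)
    weights = ink[ys, xs]
    if weights.size == 0:                     # uniform image: nothing to measure
        return 0.0
    best_angle, best_var = 0.0, -1.0
    for a in angles:
        t = np.deg2rad(a)
        r = ys * np.cos(t) - xs * np.sin(t)   # coordinate across the text line
        bins = np.round(r - r.min()).astype(int)
        profile = np.bincount(bins, weights=weights)
        v = profile.var()
        if v > best_var:
            best_angle, best_var = float(a), v
    return best_angle
```

In such a pipeline, `nms` would run on the detector's candidate boxes, and `estimate_tilt` would run on the grayscale crop of each surviving box. Correcting the angle then amounts to rotating the crop by the negative of the estimated value with any image library.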

Keywords: deep learning, convolutional neural network (CNN), object detection, inclined text region location, clustering