Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (10): 180-183.


Approach for text segmentation in web image

LIU Peizhong¹, NING Xin², LI Weijun²

  1. College of Engineering, Huaqiao University, Quanzhou, Fujian 362000, China
    2. Lab of Artificial Neural Networks, Institute of Semiconductors, CAS, Beijing 100083, China
  • Online: 2014-05-15 Published: 2014-05-14

一种网页图像文字分割方法

柳培忠¹,宁  欣²,李卫军²

  1. 华侨大学 工学院,福建 泉州 362000
    2. 中国科学院 半导体研究所 神经网络实验室,北京 100083

Abstract: To address the characteristics of text in complex web images, a text segmentation method based on the maximum between-class variance (OTSU) method is proposed. The original text image is first preprocessed, which unifies the color of the segmented characters, removes a large amount of noise, and improves image contrast. The position of each character region is then determined on the basis of a global threshold. Finally, a locally optimal threshold is used to segment each region of the text image. Experimental results show that the method improves the quality of the segmented characters while maintaining high accuracy, and that it is robust.

Key words: web image, OTSU method, text segmentation, image preprocessing

摘要: 针对复杂网页图像中文本的特点,提出了一种基于最大类间差法(OTSU)的文字分割方法。对原文字图像进行预处理,统一了分割后字符的颜色、去除了大量的噪声、提高了图像的对比度;在全局阈值的基础上确定了各字符区域的位置;利用局部最优阈值对文字图像进行局部分割。实验结果表明,方法在保证较高准确率的基础上,提升了分割后字符的效果,具有较强的鲁棒性。

关键词: 网页图像, 最大类间差法(OTSU), 文字分割, 图像预处理
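
The abstract gives only the outline of the pipeline (a global threshold to locate character regions, then a locally optimal threshold inside each region), so the following is a minimal sketch of that idea rather than the authors' implementation. It assumes an 8-bit grayscale image stored as a list of rows, dark text on a light background (that is, the preprocessing step has already unified the character color), and character regions found by a simple column projection. The function names `otsu_threshold`, `column_runs`, and `segment_text` are hypothetical, and the column-projection step is an assumption that the abstract does not specify.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels <= t form one class and pixels > t the other.
    Returns None if the pixels cannot be split into two classes.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = None, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of gray levels in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor of 1 / total**2.
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def column_runs(mask):
    """Group consecutive columns containing foreground into (start, end) spans."""
    width = len(mask[0])
    runs, start = [], None
    for x in range(width):
        has_fg = any(row[x] for row in mask)
        if has_fg and start is None:
            start = x
        elif not has_fg and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, width))
    return runs


def segment_text(image):
    """Two-stage segmentation: global Otsu to locate regions, local Otsu inside each."""
    width = len(image[0])
    out = [[0] * width for _ in image]
    t_global = otsu_threshold([p for row in image for p in row])
    if t_global is None:  # uniform image, nothing to segment
        return out

    # Stage 1: coarse foreground mask from the global threshold (dark text assumed).
    mask = [[p <= t_global for p in row] for row in image]

    # Stage 2: re-threshold each candidate region with its own Otsu threshold.
    for x0, x1 in column_runs(mask):
        region = [p for row in image for p in row[x0:x1]]
        t_local = otsu_threshold(region)
        if t_local is None:  # uniform region: fall back to the global threshold
            t_local = t_global
        for y, row in enumerate(image):
            for x in range(x0, x1):
                out[y][x] = 1 if row[x] <= t_local else 0
    return out
```

Calling `segment_text(image)` returns a binary image of the same size, with 1 marking text pixels. Re-thresholding each region separately lets a faint character and a strong character in the same image each get a threshold suited to their own contrast, which is the motivation the abstract gives for the local step.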