Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (17): 195-198.

Improved connectivity-based layout segmentation method

YU Ming, GUO Qian, WANG Dongzhuang, YU Yang   

  1. Department of Computer Science and Software, Hebei University of Technology, Tianjin 300401, China
  • Online: 2013-09-01 Published: 2013-09-13


于  明,郭  佥,王栋壮,于  洋   

  1. 河北工业大学 计算机科学与软件学院,天津 300401

Abstract: Layout segmentation is an important part of layout analysis. This paper presents an improved page segmentation algorithm based on connected regions that splits relatively complex page images quickly and effectively, while reducing the segmentation errors caused by thresholds in the original algorithm. The region of each individual character in the text image is first expanded, which makes the subsequent statistics on the spacing between connected regions more accurate and easier to compute. The image is then smeared according to these spacing statistics to form larger connected regions, and the text image is segmented into those regions. Experimental results show that the method is accurate, fast, and widely applicable, and that its advantage is greatest on relatively complex layouts.

Key words: layout segmentation, layout analysis, single-character expansion, connected component

摘要: 版面分割是版面分析的重要组成部分,经过大量的研究,如今已到了一个比较成熟的阶段。对基于连通域的版面分割算法进行了改进,能有效快速地分割较为复杂的版面图像,同时有效减少原有算法中阈值引起的分割错误的情况。先对文本图像进行单个字体的区域扩充,使后续的连通间距统计更为准确和方便,再通过连通间距的统计对图像进行模糊整合,进行文本图像的连通区域分割。实验结果表明,改进的基于连通域的算法分割版面准确,速度快,适用范围广,对于较为复杂的版面分割更具有优越性。

关键词: 版面分割, 版面分析, 单字扩充, 连通域
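
The pipeline outlined in the abstract (expand each character, gather spacing statistics between connected regions, smear, then segment) can be illustrated with a minimal sketch. The abstract does not give the details, so everything concrete here is an assumption made for illustration: the image is a list of rows of 0/1 pixels, "expansion" is taken to mean filling each character's bounding box, only horizontal gaps are measured, and the median gap is used as the smearing threshold. All function names are hypothetical and this is not the authors' implementation.

```python
from collections import deque
from statistics import median


def connected_regions(image):
    """Return bounding boxes (top, left, bottom, right) of 4-connected black regions."""
    if not image:
        return []
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if image[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(y, x)])
                top, left, bottom, right = y, x, y, x
                while queue:  # breadth-first flood fill of one region
                    cy, cx = queue.popleft()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes


def expand_characters(image):
    """Assumed expansion step: fill the bounding box of every connected region."""
    out = [row[:] for row in image]
    for top, left, bottom, right in connected_regions(image):
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                out[y][x] = 1
    return out


def horizontal_gaps(image):
    """Collect lengths of white runs lying between two black pixels in the same row."""
    gaps = []
    for row in image:
        last_black = None
        for x, pixel in enumerate(row):
            if pixel:
                if last_black is not None and x - last_black > 1:
                    gaps.append(x - last_black - 1)
                last_black = x
    return gaps


def smear_rows(image, threshold):
    """Fill white runs no longer than `threshold` that sit between black pixels."""
    out = [row[:] for row in image]
    for src, dst in zip(image, out):
        last_black = None
        for x, pixel in enumerate(src):
            if pixel:
                if last_black is not None and 0 < x - last_black - 1 <= threshold:
                    for k in range(last_black + 1, x):
                        dst[k] = 1
                last_black = x
    return out


def segment(image):
    """Expand characters, derive a threshold from gap statistics, smear, then label."""
    expanded = expand_characters(image)
    gaps = horizontal_gaps(expanded)
    threshold = median(gaps) if gaps else 0  # assumed statistic; the paper may use another
    return connected_regions(smear_rows(expanded, threshold))


# Two "words" of two 2x2 "characters" each: narrow gaps inside words, a wide gap between.
page = [[1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1] for _ in range(2)]
blocks = segment(page)
```

Deriving the threshold from the measured gaps, rather than fixing it in advance, is the design point the sketch tries to reflect: the narrow gaps between characters fall at or below the threshold and are bridged, while the wide gap between blocks is left open, so `blocks` holds one bounding box per word.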