Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (10): 164-168.

• Graphics, Images, and Pattern Recognition •

Precise text location based on region search and invariable moment classification

ZHOU Huican1, LIU Qiong1, WANG Yaonan2

  1. College of Computer Science and Technology, Hunan University of Art and Science, Changde, Hunan 415000, China
  2. College of Electrical and Information Engineering, Hunan University, Changsha 410082, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-04-01 Published:2011-04-01

基于区域搜索与矩特征分类的文本精确定位

周慧灿1,刘 琼1,王耀南2   

  1. 湖南文理学院 计算机科学与技术学院,湖南 常德 415000
  2. 湖南大学 电气与信息工程学院,长沙 410082

Abstract: A new method for locating text in images is proposed, based on searching for regions with a specified color distribution. Because text usually appears in a single color and is surrounded by different background colors, the method searches an image for single-color blocks to find candidate text regions that are enclosed by other colors. After region merging and splitting, moment invariant features and a Support Vector Machine (SVM) classifier are used to filter the candidate regions further. Because it relies on the spatial distribution of color, the method avoids the problem caused by the overlap of shape and texture features between text and other elements, which preserves its adaptability. Furthermore, extracting features from precisely located regions markedly reduces the difficulty of training the classifier, so the SVM classifier adapts readily to variations in background and text size, partial occlusion, and other complex conditions. Experiments indicate that the method adapts well to complex environments and achieves high accuracy.

Key words: text region search, text location, moment invariant features, Support Vector Machine(SVM)
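
The region search described in the abstract can be illustrated with a minimal sketch. It assumes the image has already been quantized so that each pixel holds one discrete color value, and it treats a 4-connected single-color region that never touches the image border as "surrounded by other colors". The function name `find_enclosed_regions`, the 4-connectivity, and the border test are illustrative assumptions, not details taken from the paper.

```python
from collections import deque


def find_enclosed_regions(img):
    """Return 4-connected single-color regions that do not touch the border.

    img is a list of rows, each a list of discrete color values.
    Each result is (color, pixels), where pixels is a list of (row, col).
    Assumption: a maximal single-color region that avoids the border is
    enclosed on all sides by pixels of other colors.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r0 in range(h):
        for c0 in range(w):
            if seen[r0][c0]:
                continue
            color = img[r0][c0]
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = []
            touches_border = False
            # Breadth-first flood fill over same-colored neighbors.
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                if r in (0, h - 1) or c in (0, w - 1):
                    touches_border = True
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < h and 0 <= nc < w
                            and not seen[nr][nc] and img[nr][nc] == color):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if not touches_border:
                regions.append((color, pixels))
    return regions


# Toy image: a 2x2 block of color 1 inside a background of color 0.
img = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
regions = find_enclosed_regions(img)
print(regions)
```

In this sketch the candidates would still need the merging, splitting, and classification stages that the abstract mentions; those stages are not shown here.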

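Abstract: A text location method based on searching for regions with a specific color distribution is proposed. Exploiting the fact that text usually appears in a single color surrounded by different background colors, the method uses the single color as its criterion to search for enclosed candidate text regions. Then, building on region merging and splitting algorithms, invariant moment features and a support vector machine (SVM) classifier are used to filter the candidate regions further. Compared with typical methods based on shape and texture, the use of the spatial distribution of text color avoids the interleaving of shape and texture features between text and other elements, which ensures the adaptability of the algorithm. Extracting invariant moment features from precisely searched regions reduces the difficulty of training the classifier and lets it adapt well to changes in background and text size, partial occlusion, and other complex conditions. Experiments show that the method adapts well to complex environments and achieves very high accuracy.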

Key words: text region search, text location, invariant moment features, support vector machine
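
To make the "moment invariant features" in the abstract concrete, here is a minimal sketch that computes the first two of Hu's classic moment invariants for a binary region given as a list of pixel coordinates. The abstract does not say which invariants the authors use, so the choice of Hu's invariants, the function name `hu_invariants`, and the restriction to two features are assumptions made for illustration. In a full pipeline such feature vectors would be passed to an SVM classifier, which is not shown.

```python
def hu_invariants(pixels):
    """Return the first two Hu moment invariants (phi1, phi2).

    pixels is a list of (row, col) coordinates of a binary region.
    Assumption: every pixel has unit weight, so the zeroth moment m00
    equals the pixel count.
    """
    n = len(pixels)
    rbar = sum(r for r, _ in pixels) / n  # centroid row
    cbar = sum(c for _, c in pixels) / n  # centroid column

    def eta(p, q):
        # Central moment mu_pq, normalized by m00^(1 + (p + q) / 2).
        mu = sum((r - rbar) ** p * (c - cbar) ** q for r, c in pixels)
        return mu / n ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    return phi1, phi2


# An L-shaped region, a translated copy, and a copy rotated by 90 degrees.
shape = [(0, 0), (1, 0), (2, 0), (2, 1)]
shifted = [(r + 5, c + 3) for r, c in shape]
rotated = [(c, -r) for r, c in shape]
for name, pts in (("original", shape), ("shifted", shifted), ("rotated", rotated)):
    print(name, hu_invariants(pts))
```

The three printed feature pairs should agree up to floating-point rounding, which is the property that makes such features tolerant to the position and orientation of a candidate region.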