Computer Engineering and Applications, 2011, Vol. 47, Issue (28): 150-153.

• Graphics, Image, and Pattern Recognition •

Breaking visual CAPTCHA of merged characters

WANG Lu, ZHANG Rong, YIN Dong, ZHAN Jinchun, WU Chenyang

  1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China
  • Online: 2011-10-01 Published: 2011-10-01

Abstract: CAPTCHAs have contributed greatly to maintaining Internet security and preventing malicious automated attacks. However, some CAPTCHAs can still be broken with existing pattern recognition techniques. This paper focuses on recognizing the CAPTCHAs of the Mop and Xici Hutong websites, both of which contain merged characters; the main difficulty is segmenting the merged characters in the image. For the Mop CAPTCHA, whose characters are blurred and merged, a segmentation method based on local minima and minimum projection values is proposed. For the Xici Hutong CAPTCHA, whose characters are interlaced and merged, color clustering is combined with vertical projection to segment the characters. In both cases a convolutional neural network is used for training and recognition, achieving a high recognition rate.

Key words: Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), merged character, segmentation, convolutional neural network
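
The abstract names local minima and minimum projection values as the basis for segmenting blurred, merged characters, but gives no further detail. The following is a minimal sketch of how a projection-based cut could look, not the authors' algorithm. It assumes a binarized image stored as a list of rows of 0/1 values, a known character count, and roughly evenly spaced characters; the function names and the `window` parameter are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of 0 (background) / 1 (foreground) pixels


def vertical_projection(image: Image) -> List[int]:
    """Count foreground pixels in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def local_minima(profile: List[int]) -> List[int]:
    """Interior columns that are strictly lower than their left neighbour
    and no higher than their right neighbour."""
    return [
        x
        for x in range(1, len(profile) - 1)
        if profile[x] < profile[x - 1] and profile[x] <= profile[x + 1]
    ]


def choose_cuts(profile: List[int], n_chars: int, window: int) -> List[int]:
    """Pick n_chars - 1 cut columns.

    Assumption (not from the paper): characters are roughly evenly spaced,
    so each cut is searched within `window` columns of an evenly spaced
    expected boundary. Among local minima in that range, the one with the
    smallest projection value wins; ties go to the column closest to the
    expected boundary. If no local minimum falls in the range, every
    column in the range is considered instead.
    """
    width = len(profile)
    minima = local_minima(profile)
    cuts = []
    for k in range(1, n_chars):
        expected = (k * width) // n_chars
        lo = max(1, expected - window)
        hi = min(width - 2, expected + window)
        candidates = [x for x in minima if lo <= x <= hi]
        if not candidates:
            candidates = list(range(lo, hi + 1)) or [expected]
        cuts.append(min(candidates, key=lambda x: (profile[x], abs(x - expected))))
    return cuts


def split_at(image: Image, cuts: List[int]) -> List[Image]:
    """Slice the image into vertical strips at the given cut columns."""
    bounds = [0] + sorted(cuts) + [len(image[0])]
    return [[row[a:b] for row in image] for a, b in zip(bounds, bounds[1:])]
```

Calling `choose_cuts(vertical_projection(img), n_chars, window)` and passing the result to `split_at` yields one strip per character, which could then be fed to a classifier. Restricting the search to a window around each expected boundary is one simple way to keep a deep valley inside a single character (for example, the gap in a "U") from being mistaken for a boundary between characters.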