Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (12): 62-76. DOI: 10.3778/j.issn.1002-8331.2209-0396
梁天恺,苏新铎,黄宇恒,徐天适,张华俊,曾碧
LIANG Tiankai, SU Xinduo, HUANG Yuheng, XU Tianshi, ZHANG Huajun, ZENG Bi
Online: 2023-06-15
Published: 2023-06-15
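Abstract: In the era of big data and the Internet, the development of information technology has been accompanied by the production of vast numbers of documents. Tables, which present relationships among data in an intuitive form, are common in documents, and archiving them is one of the important tasks in document processing. How to recognize tables quickly and automatically across massive document collections has become a key obstacle to intelligent document processing. Table recognition, an important branch of artificial intelligence research, enables automatic detection and recognition of table objects and structures, and is widely applied in scenarios such as intelligent document processing. A review of the concepts, techniques, applications, and challenges of the field is therefore particularly important. This paper first explains the concept of table recognition and points out that the task can be divided into two subtasks: table detection and table structure recognition. It then introduces and analyzes the mainstream anchor-based and anchor-free algorithms for table detection and summarizes the strengths and weaknesses of each. Next, it describes the principles, strengths, and weaknesses of four mainstream categories of table structure recognition algorithms: those based on semantic segmentation, those based on bidirectional splitting and merging, those that fuse neural networks, and end-to-end approaches. It also analyzes and discusses the common non-end-to-end and end-to-end table recognition algorithms that integrate table detection with table structure recognition. Finally, it summarizes the applications, challenges, and outlook of table recognition.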
梁天恺, 苏新铎, 黄宇恒, 徐天适, 张华俊, 曾碧. 智能化表格识别技术综述[J]. 计算机工程与应用, 2023, 59(12): 62-76.
LIANG Tiankai, SU Xinduo, HUANG Yuheng, XU Tianshi, ZHANG Huajun, ZENG Bi. Survey on Intelligent Table Recognition[J]. Computer Engineering and Applications, 2023, 59(12): 62-76.