Overview of Table Detection and Structure Recognition

doi:10.3778/j.issn.1002-8331.2206-0337

Abstract

This survey reviews recent literature on table analysis within the field of document analysis, focusing on two key tasks: table detection and table structure recognition. Table detection methods are grouped into those based on object detection, graph neural networks, generative adversarial networks, and deformable convolutional networks. Table structure recognition methods are grouped into those based on object detection, graph neural networks, recurrent neural networks, and deformable and dilated convolutional networks. For each category, the approaches and limitations of the models are summarized, and the related tasks and their corresponding datasets are outlined. The public datasets commonly used in table analysis are then surveyed more broadly, with details on each dataset's source, scale, scope of application, and file type. The commonly used evaluation metrics are listed, and the experimental results of existing models are compared, grouped by experimental dataset. Finally, the current state of table analysis is summarized and future research directions are discussed.

Key words: table detection, structure recognition, deep learning, dataset, evaluation metrics

ZHANG Yutong, LI Qiyuan, LIU Shukan. Overview of Table Detection and Structure Recognition[J]. Computer Engineering and Applications, 2022, 58(22): 1-11.
