Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (22): 1-11. DOI: 10.3778/j.issn.1002-8331.2206-0337
ZHANG Yutong, LI Qiyuan, LIU Shukan
Published:
2022-11-15
Online:
2022-11-15
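Abstract: To address the current state of table analysis within document analysis, this paper reviews recent literature in the field and studies its two key tasks: table detection and table structure recognition. Table detection methods are divided into those based on object detection, graph neural networks, generative adversarial networks, and deformable convolutional networks; table structure recognition methods are divided into those based on object detection, graph neural networks, recurrent neural networks, and deformable and dilated convolutional networks. The approach and limitations of each class of model are summarized, and the related tasks and their corresponding datasets are organized. More broadly, the public datasets commonly used in table analysis are summarized, with a detailed description of each dataset's source, size, scope of application, and file type. The evaluation metrics commonly used in table analysis are listed, and the experimental results of existing models are compared, grouped by experimental dataset. Finally, the current state of table analysis is summarized and future research directions are discussed.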
ZHANG Yutong, LI Qiyuan, LIU Shukan. Overview of Table Detection and Structure Recognition[J]. Computer Engineering and Applications, 2022, 58(22): 1-11.