Computer Engineering and Applications, 2023, Vol. 59, Issue (12): 62-76. DOI: 10.3778/j.issn.1002-8331.2209-0396
• Research Hotspots and Reviews •
LIANG Tiankai, SU Xinduo, HUANG Yuheng, XU Tianshi, ZHANG Huajun, ZENG Bi
Online: 2023-06-15
Published: 2023-06-15
LIANG Tiankai, SU Xinduo, HUANG Yuheng, XU Tianshi, ZHANG Huajun, ZENG Bi. Survey on Intelligent Table Recognition[J]. Computer Engineering and Applications, 2023, 59(12): 62-76.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2209-0396