Computer Engineering and Applications, 2022, Vol. 58, Issue (19): 53-63. DOI: 10.3778/j.issn.1002-8331.2201-0374
• Research Hotspots and Reviews •
MA Yao, ZHI Min, YIN Yanjun, PING Ping
Online: 2022-10-01
Published: 2022-10-01
马瑶,智敏,殷雁君,萍萍
MA Yao, ZHI Min, YIN Yanjun, PING Ping. Review of Applications of CNN and Transformer in Fine-Grained Image Recognition[J]. Computer Engineering and Applications, 2022, 58(19): 53-63.
马瑶, 智敏, 殷雁君, 萍萍. CNN和Transformer在细粒度图像识别中的应用综述[J]. 计算机工程与应用, 2022, 58(19): 53-63.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2201-0374