Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (20): 147-157. DOI: 10.3778/j.issn.1002-8331.2211-0456
• Graphics and Image Processing •
XIANG Jianwen, CHEN Minrong, YANG Baibing
Online: 2023-10-15
Published: 2023-10-17
项剑文,陈泯融,杨百冰
XIANG Jianwen, CHEN Minrong, YANG Baibing. Fine-Grained Image Classification Combining Swin and Multi-Scale Feature Fusion[J]. Computer Engineering and Applications, 2023, 59(20): 147-157.
项剑文, 陈泯融, 杨百冰. 结合Swin及多尺度特征融合的细粒度图像分类[J]. 计算机工程与应用, 2023, 59(20): 147-157.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2211-0456