Survey of Vision Transformer in Fine-Grained Image Classification

doi:10.3778/j.issn.1002-8331.2310-0395

Abstract

Fine-grained image classification (FGIC) is a long-standing problem in computer vision. Compared with conventional image classification, FGIC must distinguish classes whose objects are extremely similar to one another, which makes the task considerably harder. With the development of deep learning, Vision Transformer (ViT) models have become popular in computer vision and have been applied to FGIC. This paper introduces the challenges of FGIC, gives an overview of the ViT model, and analyzes its characteristics. It then reviews ViT-based FGIC algorithms, organized primarily by model structure into four aspects: feature extraction, feature relation modeling, feature attention, and feature enhancement. Each algorithm is summarized, and its advantages and disadvantages are analyzed. The performance of different ViT models is then compared on the same public dataset to validate their effectiveness on FGIC. Finally, the limitations of current research are pointed out, and future research directions are proposed to further explore the potential of ViT in FGIC.

Keywords: fine-grained image classification, Vision Transformer, feature extraction, feature relation modeling, feature attention, feature enhancement


SUN Lulu, LIU Jianping, WANG Jian, XING Jialu, ZHANG Yue, WANG Chenyang. Survey of Vision Transformer in Fine-Grained Image Classification[J]. Computer Engineering and Applications, 2024, 60(10): 30-46.


