[1] AHMADILIVANI M H, TAHERI M, RAIK J, et al. A systematic literature review on hardware reliability assessment methods for deep neural networks[J]. ACM Computing Surveys, 2024, 56(6): 1-39.
[2] LAMPERT C H, NICKISCH H, HARMELING S. Learning to detect unseen object classes by between-class attribute transfer[C]//Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 951-958.
[3] CHAO W L, CHANGPINYO S, GONG B Q, et al. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild[C]//Proceedings of the 14th European Conference on Computer Vision. Cham: Springer, 2016: 52-68.
[4] 王泽深, 杨云, 向鸿鑫, 等. 零样本学习综述[J]. 计算机工程与应用, 2021, 57(19): 1-17.
WANG Z S, YANG Y, XIANG H X, et al. Survey on zero-shot learning[J]. Computer Engineering and Applications, 2021, 57(19): 1-17.
[5] XU J Z, DUAN S L, TANG C W, et al. Attribute localization and revision network for zero-shot learning[J]. arXiv:2310.07548, 2023.
[6] CHEN S M, HONG Z M, HOU W J, et al. TransZero: cross attribute-guided transformer for zero-shot learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(11): 12844-12861.
[7] LIU Y, ZHOU L, BAI X, et al. Goal-oriented gaze estimation for zero-shot learning[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3793-3802.
[8] ALAMRI F, DUTTA A. Multi-head self-attention via vision transformer for zero-shot learning[J]. arXiv:2108.00045, 2021.
[9] CHEN S M, HONG Z M, XIE G S, et al. MSDN: mutually semantic distillation network for zero-shot learning[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 7602-7611.
[10] CHEN Z, HUANG Y F, CHEN J Y, et al. DUET: cross-modal semantic grounding for contrastive zero-shot learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2023: 405-413.
[11] LEI Y, SHENG G S, LI F F, et al. High-discriminative attribute feature learning for generalized zero-shot learning[J]. arXiv:2404.04953, 2024.
[12] YAMADA I, ASAI A, SAKUMA J, et al. Wikipedia2Vec: an efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia[J]. arXiv:1812.06280, 2018.
[13] MANCINI M, NAEEM M F, XIAN Y Q, et al. Learning graph embeddings for open world compositional zero-shot learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(3): 1545-1560.
[14] XU W J, XIAN Y Q, WANG J N, et al. VGSE: visually-grounded semantic embeddings for zero-shot learning[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 9306-9315.
[15] BUJWID S, SULLIVAN J. Large-scale zero-shot image classification from rich and diverse textual descriptions[J]. arXiv:2103.09669, 2021.
[16] NAEEM M F, XIAN Y Q, VAN GOOL L, et al. I2DFormer: learning image to document attention for zero-shot image classification[J]. arXiv:2209.10304, 2022.
[17] SHUBHO F H, CHOWDHURY T F, CHERAGHIAN A, et al. ChatGPT-guided semantics for zero-shot learning[C]//Proceedings of the 2023 International Conference on Digital Image Computing: Techniques and Applications. Piscataway: IEEE, 2023: 418-425.
[18] NAEEM M F, ALI KHAN M G Z, XIAN Y Q, et al. I2MVFormer: large language model generated multi-view document supervision for zero-shot image classification[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 15169-15179.
[19] MIN B N, ROSS H, SULEM E, et al. Recent advances in natural language processing via large pre-trained language models: a survey[J]. ACM Computing Surveys, 2023, 56(2): 1-40.
[20] OUYANG L, WU J, XU J, et al. Training language models to follow instructions with human feedback[J]. arXiv:2203.02155, 2022.
[21] YANG Y, PANAGOPOULOU A, ZHOU S H, et al. Language in a bottle: language model guided concept bottlenecks for interpretable image classification[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 19187-19197.
[22] PRATT S, COVERT I, LIU R, et al. What does a platypus look like? Generating customized prompts for zero-shot image classification[C]//Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 15645-15655.
[23] SRIVASTAVA P, GANU T, GUHA S. Towards zero-shot and few-shot table question answering using GPT-3[J]. arXiv:2210.17284, 2022.
[24] BELTAGY I, PETERS M E, COHAN A. Longformer: the long-document transformer[J]. arXiv:2004.05150, 2020.
[25] QU X Y, YU J, GAI K K, et al. Visual-semantic decomposition and partial alignment for document-based zero-shot learning[C]//Proceedings of the 32nd ACM International Conference on Multimedia. New York: ACM, 2024: 4581-4590.
[26] XIAN Y Q, LAMPERT C H, SCHIELE B, et al. Zero-shot learning: a comprehensive evaluation of the good, the bad and the ugly[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(9): 2251-2265.
[27] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset, CNS-TR-2011-001[R]. Pasadena: California Institute of Technology, 2011.
[28] PATTERSON G, HAYS J. SUN attribute database: discovering, annotating, and recognizing scene attributes[C]//Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2012: 2751-2758.
[29] RADFORD A, KIM J W, HALLACY C, et al. CLIP: learning transferable visual models from natural language supervision[C]//Proceedings of the International Conference on Machine Learning, 2021: 8748-8763.
[30] LIU M, LI F, ZHANG C J, et al. Progressive semantic-visual mutual adaption for generalized zero-shot learning[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 15337-15346.
[31] CHAO W L, CHANGPINYO S, GONG B Q, et al. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild[C]//Proceedings of the 14th European Conference on Computer Vision. Cham: Springer, 2016: 52-68.
[32] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[33] ALAMRI F, DUTTA A. Implicit and explicit attention for zero-shot learning[J]. arXiv:2110.00860, 2021.
[34] CHEN Z, ZHANG P F, LI J J, et al. Zero-shot learning by harnessing adversarial samples[C]//Proceedings of the 31st ACM International Conference on Multimedia. New York: ACM, 2023: 4138-4146.
[35] SONG K T, TAN X, QIN T, et al. MPNet: masked and permuted pre-training for language understanding[J]. arXiv:2004.09297, 2020.
[36] 马瑶, 智敏, 殷雁君, 等. CNN和Transformer在细粒度图像识别中的应用综述[J]. 计算机工程与应用, 2022, 58(19): 53-63.
MA Y, ZHI M, YIN Y J, et al. Review of applications of CNN and transformer in fine-grained image recognition[J]. Computer Engineering and Applications, 2022, 58(19): 53-63.