[1] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv:1409.1556, 2014.
[2] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[3] SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning[C]//Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[4] ZHU X X, ANGUELOV D, RAMANAN D. Capturing long-tail distributions of object subcategories[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 915-922.
[5] ROHRBACH M, STARK M, SCHIELE B. Evaluating knowledge transfer and zero-shot learning in a large-scale setting[C]//2011 IEEE Conference on Computer Vision and Pattern Recognition, 2011: 1641-1648.
[6] AKATA Z, PERRONNIN F, HARCHAOUI Z, et al. Label embedding for attribute-based classification[C]//2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013: 819-826.
[7] AKATA Z, REED S, WALTER D, et al. Evaluation of output embeddings for fine-grained image classification[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition, 2015: 2927-2936.
[8] LAROCHELLE H, ERHAN D, BENGIO Y. Zero-data learning of new tasks[C]//Proceedings of the 23rd National Conference on Artificial Intelligence, 2008: 646-651.
[9] PALATUCCI M, POMERLEAU D, HINTON G E, et al. Zero-shot learning with semantic output codes[C]//Advances in Neural Information Processing Systems, 2009: 1410-1418.
[10] LAMPERT C H, NICKISCH H, HARMELING S. Learning to detect unseen object classes by between-class attribute transfer[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009: 951-958.
[11] LAMPERT C H, NICKISCH H, HARMELING S. Attribute-based classification for zero-shot visual object categorization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(3): 453-465.
[12] AKATA Z, PERRONNIN F, HARCHAOUI Z, et al. Label-embedding for image classification[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1425-1438.
[13] XIAN Y Q, AKATA Z, SHARMA G, et al. Latent embeddings for zero-shot classification[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016: 69-77.
[14] ROMERA-PAREDES B, TORR P H S. An embarrassingly simple approach to zero-shot learning[C]//International Conference on Machine Learning, 2015: 2152-2161.
[15] VERMA V K, RAI P. A simple exponential family framework for zero-shot learning[C]//Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Cham: Springer, 2017: 792-808.
[16] ZHANG L, WANG P, LIU L Q, et al. Towards effective deep embedding for zero-shot learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(9): 2843-2852.
[17] XIE G S, LIU L, JIN X B, et al. Attentive region embedding network for zero-shot learning[C]//2019 IEEE Conference on Computer Vision and Pattern Recognition, 2019: 9376-9385.
[18] XIE G S, LIU L, ZHU F, et al. Region graph embedding network for zero-shot learning[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK, August 23-28, 2020. [S.l.]: Springer International Publishing, 2020: 562-580.
[19] ZHU Y Z, XIE J W, TANG Z Q, et al. Semantic-guided multi-attention localization for zero-shot learning[C]//2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing-NeurIPS Edition (EMC2-NIPS 2019), Vancouver, BC, Canada, December 13, 2019. IEEE, 2019.
[20] LIU Y, ZHOU L, BAI X, et al. Goal-oriented gaze estimation for zero-shot learning[C]//2021 IEEE Conference on Computer Vision and Pattern Recognition, 2021: 3794-3803.
[21] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017: 5998-6008.
[22] CHEN S M, HONG Z M, LIU Y, et al. TransZero: attribute-guided transformer for zero-shot learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022: 330-338.
[23] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[C]//International Conference on Learning Representations, 2021.
[24] ZHU X Z, SU W J, LU L W, et al. Deformable DETR: deformable transformers for end-to-end object detection[C]//International Conference on Learning Representations, 2021.
[25] TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image transformers & distillation through attention[C]//International Conference on Machine Learning, 2021: 10347-10357.
[26] LI Y J, SHAN C H, LI H J, et al. A capsule-unified framework of deep neural networks for graphical programming[J]. Soft Computing, 2021, 25: 3849-3871.
[27] WAH C, BRANSON S, WELINDER P, et al. Caltech-UCSD Birds 200, CNS-TR-2010-001[R]. California Institute of Technology, 2010.
[28] XIAN Y Q, LAMPERT C H, SCHIELE B, et al. Zero-shot learning: a comprehensive evaluation of the good, the bad and the ugly[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(9): 2251-2265.