1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
2. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China