[1] SCHIFANELLA R, DE JUAN P, TETREAULT J, et al. Detecting sarcasm in multimodal social platforms[C]//Proceedings of the 24th ACM International Conference on Multimedia. New York: ACM, 2016: 1136-1145.
[2] CAI Y T, CAI H Y, WAN X J. Multi-modal sarcasm detection in twitter with hierarchical fusion model[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 2506-2515.
[3] PAN H L, LIN Z, FU P, et al. Modeling intra and inter-modality incongruity for multi-modal sarcasm detection[C]//Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg: ACL, 2020: 1383-1392.
[4] ZHANG M, CHANG K, WU Y. Multi-modal semantic understanding with contrastive cross-modal feature alignment[C]//Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024: 11934-11943.
[5] LIANG B, LOU C W, LI X, et al. Multi-modal sarcasm detection with interactive in-modal and cross-modal graphs[C]//Proceedings of the 29th ACM International Conference on Multimedia. New York: ACM, 2021: 4707-4715.
[6] LIANG B, LOU C, LI X, et al. Multi-modal sarcasm detection via cross-modal graph convolutional network[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022: 1767-1777.
[7] LIU H, WANG W, LI H. Towards multi-modal sarcasm detection via hierarchical congruity modeling with knowledge enhancement[C]//Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022: 4995-5006.
[8] YUE T, MAO R, WANG H, et al. KnowleNet: knowledge fusion network for multimodal sarcasm detection[J]. Information Fusion, 2023, 100: 101921.
[9] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of NAACL-HLT, 2019.
[10] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778.
[11] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]//Proceedings of the International Conference on Machine Learning, 2021: 8748-8763.
[12] WANG X Y, SUN X W, YANG T, et al. Building a bridge: a method for image-text sarcasm detection without pretraining on image-text data[C]//Proceedings of the First International Workshop on Natural Language Processing Beyond Text. Stroudsburg: ACL, 2020: 19-29.
[13] XU N, ZENG Z X, MAO W J. Reasoning with multimodal sarcastic tweets via modeling cross-modality contrast and semantic association[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 3777-3786.
[14] 李丽, 李平. 基于交互图神经网络的方面级多模态情感分析[J]. 计算机应用研究, 2023, 40(12): 3683-3689.
LI L, LI P. Aspect-level multimodal sentiment analysis based on interaction graph neural network[J]. Application Research of Computers, 2023, 40(12): 3683-3689.
[15] 胡文彬, 陈龙, 黄贤波, 等. 融合交叉注意力的突发事件多模态中文反讽识别模型[J]. 智能系统学报, 2024, 19(2): 392-400.
HU W B, CHEN L, HUANG X B, et al. A multimodal Chinese sarcasm detection model for emergencies based on cross attention[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 392-400.
[16] 林洁霞, 朱小栋. CMHICL: 基于跨模态分层交互网络和对比学习的多模态讽刺检测[J]. 计算机应用研究, 2024, 41(9): 2620-2627.
LIN J X, ZHU X D. CMHICL: multi-modal sarcasm detection with cross-modal hierarchical interaction network and contrastive learning[J]. Application Research of Computers, 2024, 41(9): 2620-2627.
[17] HE K M, FAN H Q, WU Y X, et al. Momentum contrast for unsupervised visual representation learning[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 9726-9735.
[18] XIONG L, XIONG C, LI Y, et al. Approximate nearest neighbor negative contrastive learning for dense text retrieval[C]//Proceedings of the International Conference on Learning Representations, 2020.
[19] CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//Proceedings of the International Conference on Machine Learning, 2020: 1597-1607.
[20] ZHANG D J, NAN F, WEI X K, et al. Supporting clustering with contrastive learning[C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 5419-5430.
[21] BHATTACHARJEE D, ZHANG T, SüSSTRUNK S, et al. MuIT: an end-to-end multitask learning transformer[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 12021-12031.
[22] LI J, SELVARAJU R, GOTMARE A, et al. Align before fuse: vision and language representation learning with momentum distillation[C]//Advances in Neural Information Processing Systems, 2021: 9694-9705.
[23] ZHANG H, KOH J Y, BALDRIDGE J, et al. Cross-modal contrastive learning for text-to-image generation[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 833-842.
[24] LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[C]//Proceedings of the 7th International Conference on Learning Representations, 2019.
[25] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[C]//Proceedings of the 9th International Conference on Learning Representations, 2021.
[26] KIM Y. Convolutional neural networks for sentence classification[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1746-1751.
[27] XIONG T, ZHANG P R, ZHU H B, et al. Sarcasm detection with self-matching networks and low-rank bilinear pooling[C]//Proceedings of the World Wide Web Conference. New York: ACM, 2019: 2115-2124.
[28] LIU Y, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[J]. arXiv:1907.11692, 2019.
[29] JIA M, XIE C, JING L. Debiasing multimodal sarcasm detection with contrastive learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2024. |