Research on Multi-Modal Video Paragraph Captioning Based on Dual-Transformer Structure
ZHAO Hong (赵宏), ZHANG Lijun (张立军)
Computer Engineering and Applications (计算机工程与应用). 2025, (21): 182-191. DOI: 10.3778/j.issn.1002-8331.2407-0330