Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (5): 233-240. DOI: 10.3778/j.issn.1002-8331.2310-0034

• Graphics and Image Processing •

Research on Real-Time Semantic Segmentation Algorithms with an Improved Parallel Two-Branch Structure

苗思琦,杜煜,严超,徐成,孙慧荟   

  1. 北京联合大学 北京市信息服务工程重点实验室,北京 100101
  2. 北京联合大学 机器人学院,北京 100101
  • Publication date: 2025-03-01  Release date: 2025-03-01

Study of Real-Time Semantic Segmentation Algorithms with Improved Parallel Two-Branch Structure

MIAO Siqi, DU Yu, YAN Chao, XU Cheng, SUN Huihui   

  1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
  2. College of Robotics, Beijing Union University, Beijing 100101, China
  • Online: 2025-03-01  Published: 2025-03-01

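Abstract: Real-time semantic segmentation is of significant practical value in road scenes for intelligent driving because of its lightweight networks and fast inference. To address the loss of small-target information and the drowning of details by context in road scenes, a DDRPNet model with a parallel two-branch structure is proposed. A PAPPM module is designed to fuse semantic edge features at different scales, strengthening the modeling of boundary information. A coordinate attention mechanism is added after the 1/16, 1/32 and 1/64 resolution feature maps of the low-resolution branch to capture position and channel information at different scales and to compensate for the loss of small-target information. On the Cityscapes dataset the algorithm achieves an mIoU of 76.28% at a real-time speed of 46.3 FPS; on the CamVid dataset it achieves an mIoU of 73.2% at 95.2 FPS. Experimental results show that the model strikes a good balance between accuracy and speed and significantly improves semantic segmentation performance, giving it potential applications in intelligent driving.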

Key words: real-time semantic segmentation, two-branch structure, coordinate attention mechanism, intelligent driving
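
The abstract does not spell out how the coordinate attention mechanism is computed. The following is a minimal single-channel sketch of the general idea, assuming the commonly used formulation in which a feature map is average-pooled separately along its width and its height, and the two direction-aware descriptors are turned into gates that reweight each position. The function names (`directional_pool`, `coordinate_reweight`) are hypothetical, and the learned convolution layers that a trained network would apply between pooling and gating are deliberately omitted, so this is an illustration of the data flow rather than the DDRPNet implementation.

```python
import math


def sigmoid(x):
    """Logistic function used to squash descriptors into (0, 1) gates."""
    return 1.0 / (1.0 + math.exp(-x))


def directional_pool(feature):
    """Average-pool one H x W channel along each spatial direction.

    Returns (row_means, col_means): one value per row (pooled over width)
    and one value per column (pooled over height).
    """
    h, w = len(feature), len(feature[0])
    row_means = [sum(row) / w for row in feature]
    col_means = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means


def coordinate_reweight(feature):
    """Reweight each position by a row gate and a column gate.

    ASSUMPTION: the learned 1x1 convolutions of a real coordinate attention
    block are replaced by the identity, so the gates are just the sigmoid
    of the pooled means.
    """
    h, w = len(feature), len(feature[0])
    row_means, col_means = directional_pool(feature)
    g_h = [sigmoid(v) for v in row_means]  # one gate per row
    g_w = [sigmoid(v) for v in col_means]  # one gate per column
    return [[feature[i][j] * g_h[i] * g_w[j] for j in range(w)]
            for i in range(h)]


if __name__ == "__main__":
    fmap = [[0.0, 0.0], [0.0, 4.0]]
    print(directional_pool(fmap))
    print(coordinate_reweight(fmap))
```

Because the gates are indexed by row and by column, a small object that occupies only a few positions still influences the gate of its own row and column, which is one plausible reading of why the mechanism helps retain small-target information.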

Abstract: To address the loss of small-target information and the drowning of details by context in road scenes, a DDRPNet model with a parallel two-branch structure is proposed. DDRPNet has two notable features. First, a PAPPM module is introduced to fuse semantic edge features at different scales. Second, a coordinate attention mechanism is added after the 1/16, 1/32 and 1/64 resolution feature maps of the low-resolution branch to capture position and channel information at different scales and to compensate for the loss of small-target information. On the Cityscapes dataset, the model reaches a mean intersection over union (mIoU) of 76.28% at 46.3 FPS; on the CamVid dataset, it reaches an mIoU of 73.2% at 95.2 FPS. The model achieves a good balance between accuracy and speed and significantly improves semantic segmentation performance, giving it potential applications in intelligent driving.

Key words: real-time semantic segmentation, two-branch structure, coordinate attention mechanism, intelligent driving
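
The accuracy figures in the abstract are reported as mIoU, the mean over classes of intersection over union. As a reminder of what that metric measures, here is a small pure-Python sketch. The function name `mean_iou`, the flat-list input format, and the `ignore_index=255` default (a common convention for unlabeled pixels) are assumptions made for illustration; the paper's own evaluation code is not shown in the abstract.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection over union for flat lists of class labels.

    pred and target hold one integer class id per pixel. Pixels whose
    target equals ignore_index are skipped. Classes that appear in
    neither pred nor target are left out of the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, excluded from the metric
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    # Per-class IoU: class 0 = 1/2, class 1 = 2/3, class 2 = 1/1
    print(mean_iou(pred, target, num_classes=3))
```

Benchmark scores are normally computed from counts accumulated over the whole validation set rather than averaged per image; with this sketch that corresponds to concatenating the pixels of all images before calling the function.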