Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (21): 182-188.DOI: 10.3778/j.issn.1002-8331.2104-0025

• Pattern Recognition and Artificial Intelligence •

Stereo Matching Network Based on Edge-Guided Feature Fusion and Cost Aggregation

ZHANG Haodong, SONG Jiafei, ZHANG Guanghui   

  1. Bio-Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. ShanghaiTech University, Shanghai 201210, China
  • Online: 2022-11-01  Published: 2022-11-01

Abstract: To address the large disparity errors that stereo matching produces at fine structures, especially at object edges, a stereo matching algorithm that uses edge information to guide feature fusion and cost aggregation is proposed. Image edges guide the weighted fusion of features at different scales: edge regions of the small-scale features and non-edge regions of the large-scale features receive larger weights, yielding fused features with stronger representational ability. In the cost aggregation stage, the matching cost at edges is weakened to reduce the propagation of unreliable information. The proposed method is evaluated on the SceneFlow and KITTI 2015 datasets, where it reduces the error of the baseline network PSMNet by 35.2% and 2.2%, respectively. Experiments show that introducing edge information specifically improves the disparity estimates of existing algorithms at fine structures (especially at edges) and raises overall prediction accuracy. In addition, the proposed modules are lightweight and can be applied to different stereo matching networks.
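To make the two edge-guided steps concrete, the following is a minimal PyTorch sketch, not the authors' released code: (a) edge-guided weighted fusion of features at two scales and (b) weakening the matching cost at edge pixels before aggregation. The function names, tensor shapes, the weakening factor alpha, and the assumption that "small-scale" denotes the higher-resolution feature map are illustrative assumptions, not details confirmed by the paper.

```python
# Hypothetical sketch of the two edge-guided operations described in the abstract.
# Assumes an edge-probability map in [0, 1] aligned with the feature resolution.
import torch
import torch.nn.functional as F


def edge_guided_fusion(feat_small, feat_large, edge_prob):
    """Fuse two feature scales under edge guidance.

    feat_small: (B, C, H, W) higher-resolution ("small-scale") features (assumed meaning)
    feat_large: (B, C, h, w) lower-resolution ("large-scale") features
    edge_prob:  (B, 1, H, W) edge probability map in [0, 1] (assumed input)
    """
    # Bring the large-scale features to the common resolution.
    feat_large_up = F.interpolate(feat_large, size=feat_small.shape[-2:],
                                  mode='bilinear', align_corners=False)
    # Edge pixels favour the small-scale features; non-edge pixels favour the large-scale ones.
    return edge_prob * feat_small + (1.0 - edge_prob) * feat_large_up


def weaken_cost_at_edges(cost_volume, edge_prob, alpha=0.5):
    """Down-weight the matching cost at edge pixels before aggregation.

    cost_volume: (B, D, H, W) matching cost over D disparity hypotheses
    edge_prob:   (B, 1, H, W) edge probability map; alpha controls the weakening strength (assumed)
    """
    return cost_volume * (1.0 - alpha * edge_prob)


if __name__ == "__main__":
    # Random tensors stand in for backbone features, an edge detector output, and a cost volume.
    B, C, D = 1, 32, 48
    feat_small = torch.randn(B, C, 128, 256)
    feat_large = torch.randn(B, C, 64, 128)
    edge_prob = torch.rand(B, 1, 128, 256)
    fused = edge_guided_fusion(feat_small, feat_large, edge_prob)
    cost = weaken_cost_at_edges(torch.randn(B, D, 128, 256), edge_prob)
    print(fused.shape, cost.shape)  # torch.Size([1, 32, 128, 256]) torch.Size([1, 48, 128, 256])
```

The pixel-wise convex combination keeps the fused features the same shape as the small-scale input, so such a module could in principle be dropped into different stereo matching backbones, which is consistent with the lightweight, plug-in claim in the abstract.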

Key words: machine vision, stereo matching, convolutional neural network, binocular vision, edge information
