Computer Engineering and Applications, 2025, Vol. 61, Issue (6): 22-35. DOI: 10.3778/j.issn.1002-8331.2405-0328
WANG Wenju, TANG Bang, GU Zehua, WANG Sen
Online: 2025-03-15
Published: 2025-03-14
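Abstract: Classical multi-view 3D reconstruction methods struggle to reconstruct complex objects, produce unsatisfactory results, and scale poorly to high resolutions. Deep learning methods have been introduced to address these problems and to reconstruct 3D models with higher accuracy. This paper systematically summarizes, analyzes, and compares multi-view 3D reconstruction algorithms that use deep learning, and classifies the algorithms of recent years by their geometric representation, explicit or implicit. It focuses on neural implicit 3D reconstruction algorithms that combine implicit functions with volume rendering and currently achieve relatively high reconstruction accuracy, and it analyzes the results of some of these algorithms on datasets both quantitatively and qualitatively. It also lists commonly used datasets and evaluation metrics, and discusses future research trends and directions.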
WANG Wenju, TANG Bang, GU Zehua, WANG Sen. Overview of Multi-View 3D Reconstruction Techniques in Deep Learning[J]. Computer Engineering and Applications, 2025, 61(6): 22-35.