Computer Engineering and Applications, 2025, Vol. 61, Issue (14): 353-361. DOI: 10.3778/j.issn.1002-8331.2403-0009

• Engineering and Applications •

Geographical Localization Algorithm for Photovoltaic Panels Based on Monocular Depth Estimation

NI Yuansong, HAN Jun, HU Guangyi, WANG Wenshuai   

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 201900, China
  • Online: 2025-07-15 Published: 2025-07-15

Abstract: In unmanned aerial vehicle (UAV) inspection of photovoltaic power plants, accurately determining the geographic positions of photovoltaic panels is crucial. Most existing localization methods rely on geographic information data, multi-view images, or LiDAR, but they struggle to locate targets quickly in unknown, complex environments. Deep-learning-based monocular depth estimation (MDE) networks have already demonstrated high depth prediction accuracy in road scenes. Building on this, a novel geographic localization algorithm for photovoltaic panels is proposed: an MDE network estimates the distance to the target, and the camera imaging model is then used to transform the pixel coordinates of the panels into geographic coordinates. Because classical MDE networks predict depth poorly in UAV scenes, where shooting angles vary and distances are long, an MDE network optimized for this scenario, SwinDenseDepth, is designed. It uses an encoder composed of Swin Transformers to strengthen depth perception in UAV scenes, and combines dense connection structures with channel-spatial attention fusion modules so that semantic and contextual information improves depth estimation accuracy. Experimental results show that SwinDenseDepth predicts distances in UAV images more accurately than current mainstream MDE methods, and that the localization algorithm achieves a positioning error of 1 to 2 m on images captured at inspection heights of 30 to 60 m, meeting the practical requirements of photovoltaic panel localization.

Key words: geographical target localization, monocular depth estimation, visual geolocation, unmanned aerial vehicle (UAV), photovoltaic panel components
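
The abstract describes the pixel-to-geographic transformation only at a high level, so the following is a minimal sketch of how such a step could look, not the authors' implementation. It assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a depth value measured along the optical axis (as an MDE network might supply), a camera pose given by UAV heading and gimbal pitch with no roll, and a small-offset spherical-Earth approximation for converting metre offsets to latitude and longitude. All function and parameter names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used here as a sphere radius


def pixel_to_enu(u, v, depth_m, fx, fy, cx, cy, yaw_deg, pitch_deg):
    """Back-project a pixel with known depth to an East-North-Up offset (metres).

    Assumptions (illustrative only):
      - pinhole camera; camera axes are x right, y down, z along the optical axis
      - depth_m is the distance along the optical axis
      - yaw_deg is the heading, clockwise from north
      - pitch_deg is the downward tilt from horizontal (90 = looking straight down)
      - no camera roll and no lens distortion
    """
    # Pinhole back-projection into the camera frame.
    xc = (u - cx) / fx * depth_m
    yc = (v - cy) / fy * depth_m
    zc = depth_m

    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    sy, cyaw = math.sin(yaw), math.cos(yaw)
    sp, cp = math.sin(pitch), math.cos(pitch)

    # Camera axes expressed in ENU: right, image-down, and forward (optical axis).
    east = xc * cyaw - yc * sy * sp + zc * sy * cp
    north = -xc * sy - yc * cyaw * sp + zc * cyaw * cp
    up = -yc * cp - zc * sp
    return east, north, up


def enu_to_latlon(lat0_deg, lon0_deg, east_m, north_m):
    """Shift a latitude/longitude by small east/north offsets (spherical approximation)."""
    dlat = north_m / EARTH_RADIUS_M
    dlon = east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat0_deg)))
    return lat0_deg + math.degrees(dlat), lon0_deg + math.degrees(dlon)


if __name__ == "__main__":
    # Hypothetical values: UAV at (31.0 N, 121.0 E), camera looking straight down,
    # heading north, estimated depth of 50 m at the chosen pixel.
    e, n, _ = pixel_to_enu(1160, 340, 50.0, 1000.0, 1000.0, 960.0, 540.0, 0.0, 90.0)
    print(enu_to_latlon(31.0, 121.0, e, n))
```

For offsets of tens of metres the spherical approximation is usually adequate; a geodesy library would be the more careful choice when higher accuracy or a specific map projection is needed.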
