Content of Graphics and Image Processing in our journal

    Improved Algorithm of FCOS for Complex Scene Mask Wear Detection
    WEI Chiyu, LIU Rong, LIU Ming, ZHANG Xinyue
    Computer Engineering and Applications    2023, 59 (11): 188-194.   DOI: 10.3778/j.issn.1002-8331.2203-0048
    To address multi-scale targets, varied viewing angles, and occlusion in mask-wearing detection in complex scenes, this paper proposes a mask-wearing detection algorithm based on an improved FCOS. First, to improve feature extraction for masks of different scales, the grouped residual connection structure of Res2Net is introduced into the backbone network, and deformable convolution is integrated to extend its ability to model objects of unknown shape. Then, a feature pyramid with an integrated attention mechanism is designed to weight feature channels differently and suppress useless feature information. Finally, positive and negative samples are assigned automatically according to the statistical characteristics of the target masks, improving sample quality for masks of different scales, and Generalized Focal Loss is introduced to jointly train the classification score and the localization quality score. Experimental results show that the mAP of the improved algorithm is 6.7 percentage points higher than that of the original FCOS for mask-wearing detection in complex scenes. Compared with several mainstream object detection algorithms, the improved algorithm is also more accurate and more robust.
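The joint training of classification and localization quality scores mentioned above can be illustrated with the Quality Focal Loss term of Generalized Focal Loss. This is a minimal scalar sketch, not the authors' implementation: the function name and the default `beta=2.0` are illustrative assumptions, and the abstract does not say which settings the paper uses.

```python
import math

def quality_focal_loss(sigma: float, y: float, beta: float = 2.0) -> float:
    """Quality Focal Loss for one prediction.

    sigma: predicted joint classification-quality score in (0, 1)
    y:     soft target in [0, 1], e.g. the IoU of a positive sample (0 for negatives)
    beta:  modulating exponent (illustrative default, an assumption)
    """
    eps = 1e-12  # guards the logarithms against log(0)
    # Binary cross-entropy against the soft target y
    bce = -((1.0 - y) * math.log(1.0 - sigma + eps) + y * math.log(sigma + eps))
    # The modulating factor shrinks the loss as the score approaches the target
    return abs(y - sigma) ** beta * bce
```

The loss vanishes when the predicted score equals the quality target and grows as the two diverge, which is what ties the classification score to localization quality.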
    Improved Leaf Image Recognition of Disease in Multi-Scale Residual Networks
    ZHOU Mengran, YAO Xu
    Computer Engineering and Applications    2023, 59 (11): 195-202.   DOI: 10.3778/j.issn.1002-8331.2210-0310
    To address the large parameter count, low recognition accuracy, and slow training of crop disease image recognition models, a multi-scale convolutional network with an attention module is proposed for leaf disease image recognition. Building on the residual network module, multi-scale convolution replaces the traditional single-scale convolution, widening the network so that it captures more feature information and avoiding the overfitting caused by stacking the network too deep. To speed up training, depthwise separable convolution replaces traditional convolution to reduce the number of model parameters. An attention mechanism is introduced into the residual network to strengthen the extraction of key feature information, thereby improving recognition accuracy. In comparative tests on the experimental dataset, the improved network reaches a recognition accuracy of 99.48% with parameters of only 19.06 MB. The results show that the proposed method improves recognition performance while reducing parameters, laying a foundation for low-cost operation on terminal devices.
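Why depthwise separable convolution reduces parameters can be seen from a simple count. The sketch below ignores bias terms, and the channel and kernel sizes in the usage note are illustrative assumptions, not values from the paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For a hypothetical layer with 64 input channels, 128 output channels, and a 3×3 kernel, the counts are 73,728 and 8,768 weights, a ratio of 1/128 + 1/9, or about 0.119.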
    Improved Helmet Wear Detection Algorithm for YOLOv5
    QIAO Yan, ZHEN Tong, LI Zhihui
    Computer Engineering and Applications    2023, 59 (11): 203-211.   DOI: 10.3778/j.issn.1002-8331.2212-0190
    To address the complex structure, heavy computation, and low detection accuracy of current object detection models, a helmet-wearing detection algorithm based on an improved YOLOv5 is proposed for industrial scenarios. First, the lightweight network ShuffleNetv2 is introduced into the backbone, and the Focus structure is retained so that it and ShuffleNetv2 jointly form the backbone network, reducing the computation and parameter count of the network. Second, the Swin Transformer Block is introduced into the C3 module to obtain the C3STB module, which replaces the original C3 module in the Neck. Finally, the CBAM_H attention mechanism is designed and embedded in the Neck to capture global context information and improve detection accuracy. Experimental results show that the improved YOLOv5 model reduces the number of parameters from 6.14×10⁶ to 8.9×10⁵ and the computation from 1.64×10¹⁰ to 6.2×10⁹, while raising the mAP from 0.899 to 0.908, outperforming the original model.
    Citrus Detection Method Based on Improved YOLOv5 Lightweight Network
    GAO Xinyang, WEI Sheng, WEN Zhiqing, YU Tianbiao
    Computer Engineering and Applications    2023, 59 (11): 222-230.   DOI: 10.3778/j.issn.1002-8331.2212-0023
    To address the low accuracy, large parameter count, poor real-time performance, and unsuitability for mobile picking equipment of existing citrus detection algorithms, a citrus detection method based on the improved lightweight model YOLO-DoC is proposed. The ShuffleNetV2 network with its Bottleneck structure is introduced as the YOLOv5 backbone to build a lightweight network. The parameter-free SimAM attention mechanism is added to improve recognition accuracy for targets in complex environments. To improve the localization accuracy of the bounding boxes of target fruit, the Alpha-IoU bounding box regression loss function is introduced. Experiments show that the precision (P) and mean average precision (mAP) of the YOLO-DoC model are 98.8% and 99.1%, respectively, the number of parameters is reduced to 1/7 of that of the YOLOv5 network, and the model size is 2.8 MB. Compared with the original network, the improved model recognizes faster, localizes more accurately, and uses less memory, so it can improve picking efficiency while meeting the requirements of precise picking.
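The basic form of the Alpha-IoU regression loss raises the IoU to a power before subtracting it from 1. The sketch below shows only that basic form for axis-aligned boxes given as `(x1, y1, x2, y2)`; the function names and the default `alpha=3.0` are illustrative assumptions, and the abstract does not state which variant or exponent YOLO-DoC uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss: 1 - IoU ** alpha (alpha = 1 gives the plain IoU loss)."""
    return 1.0 - iou(pred, target) ** alpha
```

With `alpha > 1`, boxes that already overlap well contribute relatively larger gradients, which is the usual motivation for the power transform.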
    Colorectal Polyp Segmentation Combining Pyramid Vision Transformer and Axial Attention
    ZHOU Xue, BAI Zhengyao, LU Qianjie, FAN Shenglan
    Computer Engineering and Applications    2023, 59 (11): 222-230.   DOI: 10.3778/j.issn.1002-8331.2203-0110
    To address the challenge of automatic and accurate segmentation of colorectal polyps, a colorectal polyp segmentation network, PVTA-Net, is proposed. The network consists of PVTv2, a feature pyramid network (FPN), atrous spatial pyramid pooling (ASPP), a multi-head self-attention mechanism (MHSA), and a parallel axial attention module (PAA-d). PVTv2 extracts feature maps at different scales; FPN fuses features from different levels to obtain enhanced feature maps; ASPP aggregates the feature maps produced by FPN; MHSA provides a receptive field covering the entire input image; and PAA-d generates features with global relationships. PVTA-Net is compared with mainstream polyp segmentation networks on five datasets, including ColonDB, and the results show that it outperforms the existing mainstream baseline networks. To verify its generalization performance, PVTA-Net is also applied to COVID-19 lung CT image segmentation, where it again outperforms the mainstream baseline network.
    High-Precision Garbage Detection Algorithm of Lightweight YOLOv5n
    TU Chengfeng, YI Anlin, YAO Tao, HE Wenwei
    Computer Engineering and Applications    2023, 59 (10): 187-195.   DOI: 10.3778/j.issn.1002-8331.2210-0316
    Existing domestic waste detection models have many parameters and heavy computation, which makes them unsuitable for deployment on mobile or embedded devices, and they recognize few types of garbage. To address these problems, the YOLOv5n object detection algorithm is optimized for light weight and high precision. First, the lightweight networks ShuffleNetv2 and GhostNet are introduced into the YOLOv5n architecture to build a lightweight detection network. Second, the SE attention mechanism is added to strengthen the feature extraction ability of the network, and a response-based knowledge distillation algorithm is introduced to improve localization and classification accuracy, thereby improving detection accuracy. Experimental results show that, on the HGI-30 dataset, the optimized YOLOv5n reduces the number of parameters and the computation by 22.3% and 23.3%, respectively, and increases mAP0.5 and mAP0.5:0.95 by 1.6 and 2.6 percentage points, respectively.
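Response-based knowledge distillation trains a student on the teacher's output responses. The sketch below shows the classic soft-target form for classification logits (KL divergence between temperature-softened distributions). It is an illustration under assumptions: the abstract does not give the paper's detection-specific formulation, and the function names and temperature default are hypothetical.

```python
import math

def softened_softmax(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """T^2-scaled KL divergence from the softened teacher to the softened student."""
    p = softened_softmax(teacher_logits, temperature)  # teacher soft targets
    q = softened_softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature * temperature * kl
```

The loss is zero when the student reproduces the teacher's softened distribution and positive otherwise; in practice it is usually added to the ordinary task loss with a weighting factor.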
    Improved YOLOv5 Small Object Detection Algorithm in Moving Scenes
    ZHU Ruixin, YANG Fuxing
    Computer Engineering and Applications    2023, 59 (10): 196-203.   DOI: 10.3778/j.issn.1002-8331.2211-0017
    In motion scenes, device movement and camera scattering cause blurred, low-quality images, and objects are small, all of which make object detection difficult. To address these problems, an improved real-time object detection model based on YOLOv5x is proposed. First, deformable convolution replaces some of the traditional convolution layers in the original YOLOv5x to strengthen fine-grained feature extraction and small object detection in motion scenes. Second, the SE attention mechanism is added to counter the feature loss caused by the loss of global context information during convolution, improving the detection accuracy for small objects in blurred images. Finally, a new bounding box regression loss function, SIoU Loss, is introduced to address the random matching of prediction boxes during regression, improve the robustness and generalization ability of the model, and accelerate network convergence. The improved algorithm is applied to biological detection for an underwater mobile robot. Experimental results show that, compared with the YOLOv5x model, the precision P, recall R, and mean average precision mAP of the improved model increase by 5.90, 5.85, and 4.38 percentage points, respectively, which effectively enhances small object detection performance.
    Research on Brain-Inspired SNN for Underwater Target Classification of Sonar Images
    LIU Yang, TIAN Meng, CAO Kejing, WANG Ruiyi, ZHAO Wei
    Computer Engineering and Applications    2023, 59 (10): 204-212.   DOI: 10.3778/j.issn.1002-8331.2203-0003
    Sonar images are widely used in underwater rescue and seabed exploration under complex sea conditions. Long manual searches easily cause visual fatigue and lost targets. Unmanned underwater vehicles can greatly reduce workload and subjective error, but they depend on the energy efficiency and automatic classification performance of the unmanned autonomous system. Training and inference of convolutional neural networks consume a great deal of energy, so conventional deep neural networks are difficult to deploy in the mobile environment of an unmanned underwater vehicle. The scarcity of sonar image training data and the imbalance among samples further increase the difficulty of model training. Spiking neural networks avoid the high multiplication cost of convolutional neural networks by using binary, discrete, timed spike signals, and combine low energy consumption with high precision. In this paper, a low-energy shallow spiking neural network is constructed for synthetic aperture sonar image classification, and a small-sample target classification algorithm based on the spiking neural network is designed. Simulated sonar image generation based on style transfer and weighted random sampling are adopted to alleviate the scarcity of sonar image training data and the sample imbalance. Experiments show that, when sonar image samples are scarce and imbalanced, the classification accuracy of the algorithm reaches up to 91.11%, higher than that of convolutional neural networks based on ResNet50, VGG19, and MobileNet V2. Analysis of computational complexity and energy consumption shows that the spiking neural network has great advantages over convolutional neural networks. The spiking neural network is well suited to the research and implementation of brain-inspired computing and can meet the mobile computing requirements of unmanned underwater vehicles. This research offers technical advantages for the intelligent application of unmanned autonomous equipment.
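Weighted random sampling for imbalanced classes can be sketched by giving each sample a weight inversely proportional to its class frequency, so that every class is drawn about equally often. The inverse-frequency weighting, the function names, and the class labels in the usage note are illustrative assumptions; the abstract does not specify the weighting scheme used in the paper.

```python
import random
from collections import Counter

def class_balanced_weights(labels):
    """One weight per sample: 1 / (number of samples sharing its label)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]

def sample_indices(labels, k, seed=0):
    """Draw k sample indices with replacement, using class-balanced weights."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    weights = class_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)
```

For a hypothetical training set with 90 "seabed" images and 10 "target" images, each class receives a total weight of 1, so roughly half of the drawn indices point to the minority class.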
    Lightweight Building Detection Model Based on YOLOv4 Optimization for Remote Sensing Images
    DING Fei, SHI Jie, WU Hongjie
    Computer Engineering and Applications    2023, 59 (10): 213-220.   DOI: 10.3778/j.issn.1002-8331.2202-0287
    Existing building detection models suffer from low detection accuracy and large model size, which lead to an imbalance between speed and accuracy in remote sensing image detection and hinder later deployment. To address these problems, a lightweight building detection model based on an optimized YOLOv4 is proposed for remote sensing images. First, the lightweight network GhostNet replaces CSPDarknet53 for feature extraction. Second, the Dense-PANet feature fusion module is proposed, drawing on the idea of dense connection. Finally, the ECA attention mechanism is introduced into the Ghost module, which replaces the traditional convolution in the neck network. Experimental results show that, compared with YOLOv4, the proposed model sacrifices a small amount of detection speed but increases average precision by 0.96 percentage points and recall by 1.08 percentage points, while reducing model size by 71.39% and floating-point operations by 76.60%, effectively meeting the needs of building detection in remote sensing images.
    Superpixel Random Erasing for Long-Term Person Re-identification
    LI Guodong, GUO Lijun
    Computer Engineering and Applications    2023, 59 (10): 221-226.   DOI: 10.3778/j.issn.1002-8331.2203-0023
    Existing person re-identification methods often rely on the assumption that a pedestrian does not change clothing. This assumption may not hold for long-term person re-identification: in datasets captured over a long period, pedestrians change clothing frequently. Current mainstream person re-identification methods therefore often fail in the long-term setting, and their recognition accuracy drops significantly. To handle clothing changes in long-term person re-identification, a superpixel random erasing algorithm is proposed. The algorithm erases superpixel blocks that may be clothing regions by assigning them random values. The images before and after erasing are both input to the backbone for training. In addition, the model output features before and after erasing are constrained with a deep mean squared error loss, which forces the model to learn clothing-irrelevant features. Experiments show that the proposed method copes effectively with clothing changes in long-term person re-identification and greatly improves recognition accuracy over previous methods.
    Improved FCOS Remote Sensing Image Detection Method Based on Distance Constraint
    SU Shuzhi, XIE Yuqi
    Computer Engineering and Applications    2023, 59 (10): 227-235.   DOI: 10.3778/j.issn.1002-8331.2205-0478
    Deep-learning-based object detection methods for remote sensing images usually struggle to eliminate background interference in complex scenes, which leads to low detection accuracy. To solve this problem, this paper designs a feature pyramid structure based on scale stratification and proposes a distance-constrained centerness (DCCN), forming an improved FCOS remote sensing image detection method based on distance constraint. The feature pyramid structure includes a high-level semantic information activation module and a low-level effective feature perception module. The high-level semantic information activation module restructures how high-level feature maps are processed in the feature fusion stage and improves semantic perception in the top region of the feature pyramid. The low-level effective feature perception module enhances information interaction between channels by introducing a channel attention mechanism. DCCN uses the distance factor between the prediction box and the ground-truth box as a regression evaluation condition to improve the regression of the prediction box. In experiments on the NWPU VHR-10 dataset, the accuracy of this method reaches 92.6%, which is 4.9 percentage points higher than that of the original FCOS method, effectively improving the accuracy of remote sensing image detection.
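For context, the centerness that the original FCOS attaches to each location can be written in a few lines. The sketch below is the baseline FCOS definition only; the abstract does not give the formula for DCCN, so it is not reproduced here, and the function name is an illustrative choice.

```python
import math

def fcos_centerness(left: float, top: float, right: float, bottom: float) -> float:
    """Original FCOS centerness of a location inside a ground-truth box.

    The arguments are the positive distances from the location to the
    four sides of the box.
    """
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)
```

A location at the exact center of the box scores 1.0, and the score decays toward 0 as the location approaches any side, which down-weights low-quality boxes predicted far from the object center.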
    Research on PCB Defect Detection Algorithm Based on YOLOX-WSC
    TUO Bing, HUANG Liwen, TANG Xin, CHEN Lieyong, ZHOU Jing
    Computer Engineering and Applications    2023, 59 (10): 236-243.   DOI: 10.3778/j.issn.1002-8331.2211-0143
    PCB defects in complex scenes are varied and difficult to detect, and false or missed detections occur easily. To address these problems, a PCB defect detection algorithm based on YOLOX-WSC is proposed. First, the model input data are optimized: weakened data augmentation is used to counter the inaccurate images introduced by Mosaic, so that convergence completes earlier and the detection performance of the model improves. Second, the parameter-free attention SimAM is added to the backbone network to evaluate effective features using an energy function without adding model parameters, improving the feature extraction and localization capability of the algorithm. Finally, the CSPLayer structure in the feature fusion network is replaced by the CSPHB module to obtain higher-order semantic information, improve discrimination, and strengthen feature fusion and interaction, thereby improving detection performance. Experimental results show that each module improves the mean average precision (mAP) to a different degree; the mAP@0.5 of the YOLOX-WSC algorithm reaches 96.65%, and mAP@0.5:0.95 reaches 79.58%. Compared with YOLOX, the average accuracy across defect categories improves by 2.88 and 11.64 percentage points, respectively, which demonstrates the effectiveness of the algorithm.
    Improved Fabric Defect Detection Algorithm of YOLOv5
    MA Ahui, ZHU Shuangwu, LI Choudan, MA Xiaotong, WANG Shihao
    Computer Engineering and Applications    2023, 59 (10): 244-252.   DOI: 10.3778/j.issn.1002-8331.2209-0413
    Two-stage algorithms applied to fabric defect detection are slow, and one-stage algorithms are not accurate enough. To address these problems, an improved YOLOv5 fabric defect detection algorithm is proposed. First, to handle the different sizes of fabric defects, the clustering distance criterion of the K-means algorithm is modified and the sizes of the prior boxes are recalculated. Second, the standard convolution (SC) in the Neck layer is improved: depthwise separable convolution (DSC) is combined with standard convolution to reduce the number of parameters while maintaining the feature extraction capability of the network. The coordinate attention (CA) mechanism is introduced in the feature fusion stage so that the network captures relationships between channels while retaining precise position information of the target, enhancing its feature extraction and localization capabilities. Finally, the weighted bidirectional feature pyramid network (BiFPN) is used to modify the feature pyramid module, achieving simple and fast multi-scale feature fusion. After training on the dataset, the mAP of the improved YOLOv5 model reaches 97.4%, which is 2.8 percentage points higher than that of the original network and meets the requirements of fabric defect detection.
    Improved YOLOv5 for Remote Sensing Image Detection
    LIU Tao, DING Xueyan, ZHANG Bingbing, ZHANG Jianxin
    Computer Engineering and Applications    2023, 59 (10): 253-261.   DOI: 10.3778/j.issn.1002-8331.2212-0045
    In remote sensing image object detection, complex background information, small targets, and a low proportion of target semantic information cause YOLOv5 to suffer from poor detection performance, false detections, and missed detections. To address these issues, this paper proposes an improved YOLOv5 for remote sensing target detection. First, a lightweight channel attention block is embedded in the C3 module of the feature extraction and fusion modules to enhance local feature extraction and fusion. Second, to strengthen multi-scale feature representation, a fine-level detection layer that fuses shallow semantic information is added, which helps to detect small targets. Finally, Copy-Paste data augmentation is used to enrich the diversity of training samples, which further alleviates the imbalance between abundant background information and small target area without introducing extra computation cost. Experimental results show that the improved YOLOv5 achieves mAP values of 0.757 and 0.759 on the DOTA and DIOR datasets, respectively, gains of 0.017 and 0.059 over YOLOv5, and is clearly more accurate than other typical methods, demonstrating its effectiveness.
    Improved YOLOv5 Traffic Sign Detection Algorithm
    YANG Guoliang, YANG Hao, YU Shuaiying, WANG Jixiang, NIE Ziling
    Computer Engineering and Applications    2023, 59 (10): 262-269.   DOI: 10.3778/j.issn.1002-8331.2211-0359
    Traffic sign detection is widely used in intelligent transportation systems such as autonomous driving and driver assistance, and its performance bears directly on driving safety. Existing object detection algorithms perform poorly on traffic signs that are small, low in resolution, and lacking distinctive features in the image. To address this problem, a traffic sign detection algorithm based on an improved YOLOv5s is proposed. The 80×80 small-receptive-field detection layer in the original algorithm is changed to a 160×160 detection layer with an even smaller receptive field, which improves the ability of the network to detect small traffic signs and reduces the missed detection rate for small objects. An attention context module (ACM) is constructed in which each branch obtains feature information about the target and its adjacent regions from a different receptive field, and the attention mechanism makes the network focus on the traffic signs in the image and avoid interference from other complex information. A feature fusion module (FFM) is added to filter out useless information at different layers and retain only the information useful for detecting traffic signs. Implicit knowledge is added to refine the output of the detection layer. Experimental results show that the improved algorithm achieves a recall of 95.2% and an average accuracy of 97.2% on the CCTSDB traffic sign detection dataset, an improvement over the original model that is especially pronounced for small objects at medium and long distances, while the detection speed of 47.3 FPS meets real-time requirements.
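The relationship between the 80×80 and 160×160 detection layers follows from the downsampling stride. The sketch below assumes a 640×640 input, which is a common YOLOv5 setting but is not stated in the abstract; the function names are illustrative.

```python
def grid_size(input_size: int, stride: int) -> int:
    """Side length of the detection grid for a square input and a given stride."""
    return input_size // stride

def pixels_per_cell(stride: int) -> int:
    """Number of input pixels covered by one grid cell."""
    return stride * stride
```

Under that assumption, a stride of 8 gives an 80×80 grid and a stride of 4 gives a 160×160 grid, so each cell covers 16 input pixels instead of 64 and the grid has four times as many cells, which is why the finer layer is better suited to small traffic signs.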
    Small Object Detection Algorithm Based on Improved YOLOv5 in UAV Image
    XIE Chunhui, WU Jinming, XU Huaiyu
    Computer Engineering and Applications    2023, 59 (9): 198-206.   DOI: 10.3778/j.issn.1002-8331.2212-0336
    UAV aerial images exhibit large scale variation and complex backgrounds, so existing detectors have difficulty detecting small objects in them. To address false and missed detections, a small object detection model, Drone-YOLO, is proposed. A new detection branch is added to improve detection at multiple scales, and the model contains a novel feature pyramid network with multi-level information aggregation, which fuses cross-layer information. A feature fusion module based on a multi-scale channel attention mechanism is then designed to improve the focus on small objects. The classification task of the prediction head is decoupled from the regression task, and the loss function is optimized using Alpha-IoU to improve detection accuracy. Experimental results on the VisDrone dataset show that Drone-YOLO improves AP50 by 4.91 percentage points over YOLOv5, with an inference time of only 16.78 ms. Compared with other mainstream models, it detects small targets better and can effectively complete small target detection in UAV aerial images.
    Target Detection Algorithm of Remote Sensing Image Based on Improved YOLOv5
    LI Kunya, OU Ou, LIU Guangbin, YU Zefeng, LI Lin
    Computer Engineering and Applications    2023, 59 (9): 207-214.   DOI: 10.3778/j.issn.1002-8331.2209-0119
    To address the low detection accuracy caused by complex backgrounds, varied target sizes, and numerous small targets in remote sensing images, this paper proposes a remote sensing image target detection algorithm based on an improved YOLOv5. The channel-global attention mechanism (CGAM) is introduced into the backbone network to strengthen feature extraction for targets at different scales and suppress interference from redundant information. The dense upsampling convolution (DUC) module is introduced to upsample low-resolution convolutional feature maps, which effectively improves the fusion of different convolutional feature maps. Applied to the public remote sensing dataset RSOD, the improved YOLOv5 algorithm reaches an average precision (AP) of 78.5%, which is 3.1 percentage points higher than that of the original algorithm. Experimental results show that the improved algorithm effectively improves the accuracy of target detection in remote sensing images.
    FS-YOLOv5: Lightweight Infrared Road Target Detection Method
    HUANG Lei, YANG Yuan, YANG Chengyu, YANG Wei, LI Yaohua
    Computer Engineering and Applications    2023, 59 (9): 215-224.   DOI: 10.3778/j.issn.1002-8331.2210-0487
    To solve the low precision, poor real-time performance, and difficulty with small targets of traditional target recognition algorithms in complex scenes, a lightweight FS-YOLOv5s model for infrared scenes is proposed. Building on the one-stage detection network YOLOv5s, a new FS-MobileNetV3 network is proposed to extract features in place of the CSPDarknet backbone. A power transform is applied to the CIoU loss function of the original network, replacing it with α-CIoU to improve the detection of small targets. The K-means++ clustering algorithm is then applied to the FLIR infrared dataset to regenerate the anchors. DIoU-NMS replaces the NMS post-processing of the original network to improve the detection of occluded objects and reduce the missed detection rate. Ablation experiments on the FLIR infrared dataset verify that the lightweight FS-YOLOv5s algorithm is adequate for road target detection in infrared scenes. Compared with the original network, the average accuracy of the FS-YOLOv5s model falls by only 0.37 percentage points, while the model size is reduced by 26%, the number of parameters is reduced by 29%, and the detection speed increases by 11 FPS, meeting the needs of mobile deployment in different scenarios.
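DIoU-NMS differs from standard NMS in that the suppression test subtracts a normalized center-distance penalty from the IoU, so overlapping boxes whose centers are far apart (as with occluded objects) are less likely to be suppressed. The following is a minimal sketch for boxes given as `(x1, y1, x2, y2)` with positive area; the function names and the threshold default are illustrative assumptions, not the paper's settings.

```python
def diou(a, b):
    """IoU minus squared center distance over squared enclosing-box diagonal."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2

def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates whose DIoU with the selected box is at or below the threshold
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

For example, with boxes `(0, 0, 10, 10)`, `(1, 1, 11, 11)`, and `(20, 20, 30, 30)` scored 0.9, 0.8, and 0.7, the second box is suppressed by the first and the third is kept.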
    Hybrid Samples Image Dehazing via Latent Space Translation
    ZHENG Yutong, SUN Haoying, SONG Wei
    Computer Engineering and Applications    2023, 59 (9): 225-236.   DOI: 10.3778/j.issn.1002-8331.2201-0105
    Deep learning learns the inherent regularities of samples from datasets, which determine model performance to a certain extent. However, single image dehazing datasets may lack paired real data, and it is difficult to synthesize paired data that simulate the real environment for training. As a result, trained models may perform poorly on real hazy images. This paper formulates the hybrid samples learning problem and proposes a hybrid samples learning algorithm based on latent space translation, which aims to make full use of both paired and unpaired data (hybrid samples). A VAE-GAN (variational auto-encoder, generative adversarial network) encodes hybrid samples into a latent space, and an adversarial loss then aligns real data with synthetic data. The mixup of feature adaptive fusion (MFF) module in the mapping net learns the translation between paired data. In this way, a dehazing data path from the real hazy image to the clear image is established. Experimental results show that the proposed model performs well on real hazy images compared with other algorithms, is particularly effective on images with thick haze, and achieves a higher peak signal-to-noise ratio than the comparison algorithms.
    3D Object Detection Method Combining on Graph Sampling and Graph Attention
    LI Wenju, CHU Wanghui, CUI Liu, SU Pan, ZHANG Gan
    Computer Engineering and Applications    2023, 59 (9): 237-244.   DOI: 10.3778/j.issn.1002-8331.2202-0075
    In 3D object detection on point clouds, objects that are small or appear in complex scenes are detected with lower accuracy. Therefore, a point-cloud-based 3D object detection method using graph sampling and a graph attention mechanism is proposed. First, the method reduces the voxel size used for downsampling to preserve the point cloud density of small objects. It then introduces graph sampling to reduce the cost of constructing topological graphs in the point cloud for feature extraction. Finally, a self-attention mechanism embedded in the graphs before and after graph sampling enhances the feature extraction ability of the network. Compared with the baseline on the KITTI dataset, the proposed method improves the detection accuracy for cars in hard scenes by 1.96%, and for pedestrians and cyclists in moderate and hard scenes by 4.21% and 2.57%, respectively. In addition, the training time is reduced by 15%. These results demonstrate improved accuracy for small objects in point clouds and show that the sampling method improves training efficiency.
    Multi-Branch Network Facial Expression Recognition Based on Gender Constraint
    ZENG Xi, XIN Yuelan, XIE Qiqi
    Computer Engineering and Applications    2023, 59 (9): 245-254.   DOI: 10.3778/j.issn.1002-8331.2202-0120
    Aiming at the large intra-class variation and small inter-class differences of facial expressions across genders, this paper proposes a multi-branch facial expression recognition network based on gender constraints. Firstly, the K-means clustering algorithm and a convolutional neural network are used to obtain the relationships between facial expression classes under gender constraints. Then, according to the obtained inter-class relationships, a backbone network and a branch network with a channel attention mechanism are constructed to further distinguish strongly similar classes and highlight intra-class changes in the facial expressions of different genders. Finally, experiments and analysis are performed on the CK+, FER2013 and RAF-DB datasets. The experiments show that the average recognition rate of the proposed network on CK+, FER2013 and RAF-DB is superior to that of other advanced methods, reaching 97.60%, 73.58% and 87.98% respectively.
    Curb Segmentation Using Dual Branch and Feature Fusion Network
    SUN Yang, HAN Lei, WANG Chengqing, LI Yunpeng
    Computer Engineering and Applications    2023, 59 (9): 255-261.   DOI: 10.3778/j.issn.1002-8331.2202-0208
    Curb detection is an important goal of intelligent vehicle environmental perception. This paper uses semantic segmentation to detect curb objects. Aiming at the problem that semantic segmentation networks cannot balance shallow features and deep features, a real-time curb segmentation network with dual-branch feature fusion is designed. The main branch of the network uses residual structure modules for downsampling, and restores the original resolution once the feature map resolution reaches 1/16 of the input resolution. Multiple modules are used to fuse shallow spatial features and high-level semantic features. The spatial detail feature extraction (SDFE) module makes up for the loss of geometric features. The joint feature pyramid (JFP) module combines the multi-scale features that carry strong semantic information from multiple stages of the network. A spatial feature attention (FA) mechanism is designed in the branch, and four convolution-normalization layers are applied after the attention step to enhance the extraction of spatial detail features. A feature fusion module (FFM) is designed to fuse high-level semantic features with shallow features. In the performance evaluation, the network reaches a test mIoU of 79.65% at 59.6 FPS. Experiments carried out on the road show that the segmentation is both fast and accurate.
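    For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection over union is commonly computed from flattened label maps. The function name `mean_iou` and the toy labels are assumptions for illustration, not the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy example: 0 = background, 1 = curb (flattened pixel labels).
pred = [0, 0, 1, 1, 1, 0]
target = [0, 0, 1, 1, 0, 0]
miou = mean_iou(pred, target, num_classes=2)
print(round(miou, 4))
```

    Here the background IoU is 3/4 and the curb IoU is 2/3, so the mean is 17/24.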
    Image Enhancement and Improved Marine Biological Image Detection Algorithm
    GUO Pingxiu, LI Qinan, YANG Zhongpeng
    Computer Engineering and Applications    2023, 59 (8): 208-216.   DOI: 10.3778/j.issn.1002-8331.2112-0556
    In order to improve the detection accuracy of marine biological images, this paper uses an optimized MSRCR to enhance marine biological images, and proposes an improved YOLOv4 algorithm (IYOLOv4) based on ASFF and Focal Loss. First, because red light is strongly attenuated as light propagates in seawater, marine biological images suffer from low contrast and color shift. The method replaces the Gaussian filtering of the traditional MSRCR with bilateral filtering, which preserves more image boundary features, corrects the color shift by increasing the red component of the image, and also improves the local contrast. Secondly, the algorithm uses the ASFF structure to make full use of the semantic information of the high-level features and the fine-grained features of the bottom layer, and fully integrates the features by learning weight parameters to enhance the fusion effect. Finally, the BCE Loss used as the classification loss of YOLOv4 is replaced with Focal Loss to address class imbalance in the dataset and improve detection accuracy. The experimental results show that, compared with the YOLOv4 algorithm, the AP of the four classes holothurian, scallop, starfish and echinus increases by 10.35, 9.13, 2.22 and 0.14 percentage points respectively, and the mAP increases by 5.45 percentage points.
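    The difference between BCE and Focal Loss can be seen in a small sketch for a single binary prediction. The defaults `gamma=2.0` and `alpha=0.25` are the commonly used values and are an assumption here; the abstract does not state which values the authors use.

```python
import math


def bce(p, y):
    """Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss: BCE scaled by alpha_t * (1 - p_t) ** gamma to down-weight easy samples."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy, hard = 0.9, 0.1  # predicted probabilities for two positive samples
print("BCE   easy/hard ratio:", bce(easy, 1) / bce(hard, 1))
print("Focal easy/hard ratio:", focal_loss(easy, 1) / focal_loss(hard, 1))
```

    The easy sample contributes far less relative to the hard one under Focal Loss, which is why it helps when a few classes dominate the dataset.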
    Improved SegFormer Network Based Method for Semantic Segmentation of Remote Sensing Images
    TIAN Xuewei, WANG Jiali, CHEN Ming, DU Shouqing
    Computer Engineering and Applications    2023, 59 (8): 217-226.   DOI: 10.3778/j.issn.1002-8331.2204-0141
    Existing segmentation algorithms have difficulty accurately segmenting small objects and object boundaries in remote sensing images, because such images contain objects at multiple scales and small objects carry insufficient semantic information. Therefore, an improved SegFormer semantic segmentation method for remote sensing images is proposed, which combines the multi-scale features output by the SegFormer encoder in a cascaded manner. When merging high-level semantic features, a semantic feature fusion module is used to preserve the fuzzy boundaries; when merging detailed features, a gated attention mechanism module is used to filter some high-level semantic features and reduce their interference with the detailed features. After that, the multi-scale features are up-sampled and concatenated, and a multi-local channel attention module recalibrates the mapping relationship of the concatenated features according to the channel context to enhance the final segmentation effect. The experimental results on the UAVid and ISPRS Potsdam datasets show that the improved SegFormer method outperforms the mainstream segmentation methods it is compared with, and segments small objects and boundaries in remote sensing images more accurately.
    Multi-Head Attention Detection of Small Targets in Remote Sensing at Multiple Scales
    ZHANG Zhaoyang, ZHANG Shang, WANG Hengtao, RAN Xiukang
    Computer Engineering and Applications    2023, 59 (8): 227-238.   DOI: 10.3778/j.issn.1002-8331.2210-0366
    Targets in complex geospatial remote sensing images exhibit multi-scale characteristics and morphological changes, and small targets offer too few discriminative features, resulting in low detection and recognition accuracy. This paper proposes YOLO-StrVB, a multi-scale detection algorithm for small objects in remote sensing images based on multi-head attention. First, the network structure is reconstructed into a multi-scale network model with an additional target detection layer, which improves the ability of the model to detect small remote sensing targets from feature extraction at different scales. Then, a bidirectional feature pyramid network (Bi-FPN) is added for multi-scale feature fusion to improve bidirectional cross-scale connections and weighted feature fusion. Next, the multi-head attention block of Swin Transformer is integrated at the end of the YOLOv5 network to improve the multi-scale fusion of the receptive field, adapt it to the target recognition task, and optimize the backbone network. Finally, Varifocal Loss is used to train the network to improve the confidence and positioning accuracy for densely distributed small targets in remote sensing images, and CIoU is selected as the bounding box regression loss to improve the regression accuracy associated with the IoU-aware classification score (IACS). In experiments on the remote sensing dataset NWPU VHR-10, the mAP is 3.05 percentage points higher than that of the original YOLOv5 model, showing that the method effectively improves the accuracy and robustness of small target detection in geospatial remote sensing images.
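    As a reference for the CIoU regression loss named above, here is a minimal sketch of the standard CIoU formulation (IoU minus a centre-distance penalty and an aspect-ratio penalty). The box format `(x1, y1, x2, y2)` and the function names are assumptions for illustration; the authors' training code may differ in detail.

```python
import math


def ciou(box_a, box_b):
    """Complete IoU of two boxes given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1)) - math.atan((ax2 - ax1) / (ay2 - ay1))
    ) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return iou - rho2 / c2 - alpha * v


def ciou_loss(box_a, box_b):
    return 1.0 - ciou(box_a, box_b)


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes still give a useful gradient signal
```

    Unlike plain IoU loss, the loss for disjoint boxes keeps growing with centre distance, which is the property that helps localization of small targets.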
    Dehazing Object Tracking Algorithm Using Dark Channel Prior
    CHANG Jiashun, SUN Lifan, YANG Zhe, ZHANG Jinjin, FU Zhumu
    Computer Engineering and Applications    2023, 59 (8): 239-246.   DOI: 10.3778/j.issn.1002-8331.2112-0360
    To address the tracking drift caused by degraded image quality and blurred background information in hazy scenes, this paper proposes a new dehazing object tracking algorithm that combines the dark channel prior with the siamese network framework. The algorithm first dehazes the template image and the search image and extracts their features with a convolutional neural network. It then estimates the target location according to the similarity of the extracted features. In addition, to address the shortage of datasets for hazy scenes, this paper artificially synthesizes haze datasets (OTB-H, OTB-M, OTB-L) from the existing test dataset OTB100. Finally, the proposed algorithm is compared with existing algorithms on each dataset. The experimental results demonstrate that the proposed algorithm has better tracking performance in hazy scenes. On the OTB-H dataset, the tracking accuracy is 0.713 and the tracking success rate is 0.519. Compared with SiamFC, the tracking accuracy is improved by 35.6% and the success rate by 33.4%, while the real-time requirements are still met.
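    The dark channel prior used for dehazing can be sketched as follows: the dark channel is the patch-wise minimum over all colour channels, and a transmission map is estimated from it. This follows the widely known formulation of the prior; the patch size, `omega=0.95` and the function names are assumptions, and the abstract does not say which settings the authors use.

```python
def dark_channel(image, patch=3):
    """image: H x W nested list of (r, g, b) tuples in [0, 1]; returns an H x W dark channel."""
    h, w = len(image), len(image[0])
    r = patch // 2
    min_rgb = [[min(px) for px in row] for row in image]
    return [
        [
            min(
                min_rgb[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def estimate_transmission(image, airlight, omega=0.95, patch=3):
    """t = 1 - omega * dark_channel(I / A), where A is the per-channel atmospheric light."""
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row] for row in image]
    return [[1.0 - omega * d for d in row] for row in dark_channel(normalized, patch)]


hazy = [[(0.8, 0.8, 0.8)] * 2 for _ in range(2)]   # uniformly grey, haze-like patch
clear = [[(0.9, 0.0, 0.2)] * 2 for _ in range(2)]  # saturated colour, one channel near zero
print(estimate_transmission(hazy, (1.0, 1.0, 1.0))[0][0])
print(estimate_transmission(clear, (1.0, 1.0, 1.0))[0][0])
```

    A low transmission value marks heavy haze; the haze-free image is then recovered by inverting the atmospheric scattering model with this map.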
    Dense Object Detection in Remote Sensing Images Under Complex Background
    LI Abiao, GUO Hao, QI Chang, AN Jubai
    Computer Engineering and Applications    2023, 59 (8): 247-253.   DOI: 10.3778/j.issn.1002-8331.2112-0417
    The objects in remote sensing images are densely packed, and common detection algorithms face challenges in differentiating them. At the same time, the background of the target is complex, resulting in high-response background noise in the feature map generated by the model, which can lead to unsatisfactory detection results. An object detection algorithm based on the CenterNet network with optimized weight distribution is proposed to resolve these issues. To begin with, an optimized weight distribution strategy is designed to enlarge the loss returned from the target edge region when the heatmap loss is calculated, encouraging the network to pay close attention to the edges of dense targets and lowering the probability that the algorithm recognizes several dense targets as a single target. Second, a semantic segmentation module is incorporated into the CenterNet network structure, training the network model to learn the segmentation map of each target and applying the segmentation map predicted by the model to suppress the high-response background noise in the feature map. Experiments are carried out on the DOTA dataset, and the proposed method outperforms the previous algorithm with a mean average precision (mAP) of 68.56%. Compared with the original CenterNet algorithm, the mAP is improved by 6.53 percentage points. Experimental results show that the improved CenterNet algorithm adapts better to the detection of dense targets distributed in complex backgrounds.
    Adversarial Semi-Supervised Semantic Segmentation with Attention Mechanism
    YUN Fei, YIN Yanjun, ZHANG Wenxuan, ZHI Min
    Computer Engineering and Applications    2023, 59 (8): 254-262.   DOI: 10.3778/j.issn.1002-8331.2112-0484
    Image semantic segmentation is one of the most important research topics in computer vision. Current semantic segmentation algorithms based on fully convolutional neural networks have several problems, such as a lack of correlation between pixels, an effective convolution kernel receptive field that is smaller than its theoretical value, and the high cost of manually labeling datasets. In order to solve these problems, an adversarial semi-supervised semantic segmentation model integrating an attention mechanism is proposed. A generative adversarial network is applied to image semantic segmentation to enhance the correlation between pixels. In this model, a self-attention module and a multi-kernel pooling module are added to the generator network to fuse long-distance semantic information and enlarge the convolution kernel receptive field. Extensive experiments are carried out on the PASCAL VOC2012 augmented dataset and the Cityscapes dataset, and the experimental results confirm the validity and reliability of the proposed method for image semantic segmentation.
    Siamese Network Weak Target Tracking Algorithm Fused with Location Information Attention
    WEI Jian, ZHAO Xu, LI Lianpeng
    Computer Engineering and Applications    2023, 59 (7): 198-206.   DOI: 10.3778/j.issn.1002-8331.2112-0076
    The classic siamese network shows poor robustness when tracking targets with weak features. For this reason, a siamese network algorithm that integrates an attention mechanism based on the two-dimensional position information of the target is designed. Built on the siamese region proposal network (SiamRPN) tracker, the algorithm introduces a location information attention module into the feature extraction network. The module extracts the two-dimensional location information of the target features to compute the feature channel weights, which improves the feature extraction ability of the network for weak targets. In the feature extraction backbone, the lightweight deep network MobileNetV2 is used instead of AlexNet, which reduces the number of model parameters and the amount of computation while improving the feature extraction capability of the backbone. In the similarity measurement module, multi-layer feature fusion is adopted to make full use of the position information in the shallow features of the deep network and the semantic information in the deep features, strengthening the tracking accuracy and positioning accuracy of the algorithm. Experimental results show that, compared with the baseline algorithm, the success rate of this algorithm is increased by 12.6% and the precision by 8.4%. The tracking speed reaches 74 frames per second, which meets the real-time requirements.
    RGB-D Saliency Detection Based on Multi-Level Feature Fusion
    SHI Yue, YU Wanjun, CHEN Ying
    Computer Engineering and Applications    2023, 59 (7): 207-213.   DOI: 10.3778/j.issn.1002-8331.2112-0219
    When exploring the cross-modal information of each layer, most RGB-D saliency detection methods directly merge the depth map with the RGB map without processing, and use the same fusion strategy at every level. This causes two problems: (1) low-quality depth maps bring a lot of redundant information into the network, which has a negative impact on detection; (2) adopting the same fusion strategy at all levels ignores the fact that the model attends to global and local features to different degrees at different levels. In order to solve these problems, a top-down multi-level feature fusion structure is proposed. A depth enhancement module is designed to filter low-quality depth map information effectively. A high-level fusion module is designed to integrate the global features in the high levels. A low-level fusion module is designed to extract and merge useful local features. Comprehensive experiments on five public datasets against seven advanced models show that the proposed method is superior on four metrics: F value, avgF value, S value and MAE.
    Improved YOLOv5 Smoke Detection Model
    ZHENG Yuanpan, XU Boyang, WANG Zhenyu
    Computer Engineering and Applications    2023, 59 (7): 214-221.   DOI: 10.3778/j.issn.1002-8331.2209-0421
    Aiming at the problems of complex smoke scenes and the difficulty of detecting small smoke targets, an improved YOLOv5 smoke detection model is proposed. Firstly, in order to increase the detection accuracy for smoke targets, the feature fusion process is modified using the weighted bidirectional feature pyramid network (BiFPN) structure, and a mixed attention mechanism over the channel and spatial dimensions is added to reassign the weights of the fused feature map. This enhances the features of the smoke target while suppressing irrelevant regional features, giving smoke a more robust feature representation. Secondly, α-CIoU is used to replace GIoU as the bounding box regression loss to improve the accuracy of the predicted boxes. The classification loss is removed to reduce the complexity of the model. Experimental results show that the improved YOLOv5 smoke detection model has higher detection accuracy than the YOLOv5 model, with an accuracy of 99.35%, a recall rate of 99.18% and a detection speed of 46 frames/s. The proposed algorithm can effectively extract the overall characteristics of smoke, and is better suited to smoke detection in complex scenes and for small targets.
    Dual-Modal Feature Fusion Semantic Segmentation of RGB-D
    LUO Penlin, FANG Yanhong, LI Xin, LI Xue
    Computer Engineering and Applications    2023, 59 (7): 222-231.   DOI: 10.3778/j.issn.1002-8331.2111-0518
    Existing RGB image semantic segmentation networks for complex indoor scenes are susceptible to factors such as color and lighting, and integrating dual-modal features effectively is also challenging. To address these issues, this paper proposes an attention mechanism bimodal fusion network (AMBFNet) that adopts an encoder-decoder structure. First, the bimodal fusion network structure (AMBF) is built to reasonably allocate the location and channel information of the features at each stage of the encoding branch. Then, the DA-context module is designed to merge the context information. Finally, the multi-scale feature maps are fused across layers in the decoder to reduce misrecognition between classes and the loss of small-scale targets in the prediction results. The test results on the two public datasets SUN RGB-D and NYU Depth v2 (NYUDv2) show that, under the same hardware conditions, the proposed network achieves better segmentation performance than advanced RGB-D semantic segmentation networks such as RedNet, ACNet and ESANet. The MIoU reaches 47.9% and 50.0% on the two datasets, respectively.
    Improved YOLOv5 Lightweight Mask Detection Algorithm
    LIU Chonghao, PAN Lihu, YANG Fan, ZHANG Rui
    Computer Engineering and Applications    2023, 59 (7): 232-241.   DOI: 10.3778/j.issn.1002-8331.2209-0013
    In order to improve the detection efficiency of existing mask detection algorithms and reduce the number of parameters and the model size, an improved lightweight mask detection algorithm, YOLOv5-MBF, is proposed. Firstly, the GELU activation function replaces the hard-swish activation function of the MobileNetV3 network, which improves the convergence of the model, and the improved MobileNetV3 network replaces the YOLOv5s backbone network, which reduces the amount of computation and increases the detection speed. Secondly, the BiFPN feature pyramid structure is added to fuse different feature layers, which improves the detection accuracy. At the same time, Mosaic and Mixup data augmentation are used in data processing to improve the generalization and robustness of the model. Focal-Loss EIoU is used as the regression loss function, which speeds up the convergence of model training and improves the positioning accuracy of the mask and face bounding boxes. Finally, the CBAM attention mechanism is added to make the model pay more attention to important features, suppress insignificant features and improve the detection performance. The experimental results show that the average accuracy of the algorithm is 89.5% on the mask-wearing and mask-not-wearing targets, the inference speed is increased by 43%, the model parameters are reduced by 49%, and the model size is reduced by 48%, which meets the real-time and accuracy requirements of mask detection tasks.
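    To make the activation swap concrete, the two functions can be compared directly. This is a minimal sketch using the standard definitions of GELU (exact, erf-based form) and hard-swish; whether the authors use the exact or an approximate GELU is not stated in the abstract.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def hard_swish(x):
    """Hard-swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  hard_swish={hard_swish(x):+.4f}")
```

    GELU is smooth everywhere, whereas hard-swish has kinks at x = -3 and x = 3; both are close to the identity for large positive inputs.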
    Optimizing Human Abnormal Behavior Detection Method of YOLO Network
    ZHANG Hongmin, ZHAUNG Xu, ZHENG Jingtian, FANG Xiaobing
    Computer Engineering and Applications    2023, 59 (7): 242-249.   DOI: 10.3778/j.issn.1002-8331.2208-0061
    Because environmental background information interferes strongly in public surveillance videos and abnormal human behavior targets vary in scale, it is currently difficult to improve the precision of abnormal human behavior detection. To address these issues, this paper designs an abnormal behavior detection method based on an improved YOLOv5 model. In this method, a shielded convolutional attention module is added to the original YOLOv5 backbone network. The module starts from a shielded convolutional layer, in which the central region of the receptive field is covered. The shielded information is predicted, and the errors related to the shielded information are used as abnormality scores. At the same time, a Swin-CA module is embedded in the detection network. By learning the features of adjacent layers, the module gains a stronger grasp of global information, which reduces the effect of background information on the detection results. By extracting the scale characteristics of abnormal human behavior in different backgrounds, it reduces the computational complexity of the whole model and improves the precision with which the model locates abnormal human behavior targets. Experimental results on the UCSD-PED1, KTH and Shanghai Tech datasets show that the precision of the proposed method reaches 98.2%, 96.4% and 95.8%, respectively.
    Marker-Assisted Multi-Feature Fusion Localization Method
    LIU Jiamin, CHEN Shenglun, WANG Zhihui, LI Haojie
    Computer Engineering and Applications    2023, 59 (6): 187-195.   DOI: 10.3778/j.issn.1002-8331.2111-0316
    The accuracy of monocular simultaneous localization and mapping relies on the feature extraction and association algorithms applied to the images. The estimated trajectory drifts because of error accumulation. To address this problem, this paper proposes a marker-assisted multi-feature fusion localization algorithm, which uses the planar structure of the environment where the markers are located to assist localization. The algorithm uses point, marker and plane features to improve the accuracy of pose estimation. Marker features provide more robust points. Plane features use fewer parameters to represent larger structures, reducing the impact of occlusion on feature matching. The algorithm establishes relationships between markers through the planar structures, so that the markers better satisfy their mutual geometric position constraints during optimization, thereby reducing the drift caused by accumulated error. Experimental results show that the algorithm tracks effectively in environments containing markers, and closes loops more reliably in difficult environments. Compared with similar methods, the accuracy is significantly improved.
    Feature-Balanced UAV Aerial Image Target Detection Algorithm
    XU Jian, XIE Zhengguang, LI Hongjun
    Computer Engineering and Applications    2023, 59 (6): 196-203.   DOI: 10.3778/j.issn.1002-8331.2111-0075
    Small targets and large changes in viewing angle in UAV aerial images result in poor object detection performance. To solve this problem, a network for UAV small target detection is designed. A deformable convolution module improves the feature extraction ability for multi-view targets, addressing the difficulty of extracting target features when the perspective of aerial targets changes sharply. A feature balance pyramid module enhances the low-level features of small targets in the network, addressing the poor detection of small targets in aerial images caused by the easy loss of their features. At the same time, pixel un-shuffle is used to construct the bottom-level large-scale features, avoiding large-scale convolution over the bottom-level features of the feature balance pyramid module. A cross self-attention mechanism is used to obtain object context information and reduce missed detections and false detections under severe conditions. Simulation results on public datasets show that the average accuracy of the proposed algorithm is better than that of mainstream detection algorithms while maintaining real-time detection.
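    Pixel un-shuffle, mentioned above, rearranges spatial blocks into channels so that resolution is reduced without discarding any values. The following is a minimal single-channel sketch on nested lists; the function name and the channel ordering (row offset first, then column offset) are assumptions chosen to mirror the common convention.

```python
def pixel_unshuffle(feature_map, r=2):
    """Rearrange an H x W map into r*r channels of size (H/r) x (W/r), losing no values."""
    h, w = len(feature_map), len(feature_map[0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    channels = []
    for dy in range(r):          # row offset inside each r x r block
        for dx in range(r):      # column offset inside each r x r block
            channels.append(
                [[feature_map[y * r + dy][x * r + dx] for x in range(w // r)]
                 for y in range(h // r)]
            )
    return channels


fm = [[4 * i + j for j in range(4)] for i in range(4)]  # 4 x 4 map with values 0..15
out = pixel_unshuffle(fm, r=2)
print(len(out), out[0], out[3])
```

    Each output channel holds one position of every 2 x 2 block, so a later convolution sees the full information at a quarter of the spatial size.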
    Real-World Image Super-Resolution Based on Noise and U-Shape Discrimination Network
    LI Hao, YANG Zhijing, WANG Meilin, LING Wing-Kuen
    Computer Engineering and Applications    2023, 59 (6): 204-211.   DOI: 10.3778/j.issn.1002-8331.2111-0393
    Previous image super-resolution algorithms based on convolutional neural networks are usually trained on synthetic ideal datasets, and their performance drops dramatically in real-world scenarios. To better extract the original feature information in real-world images and model their degradation process, this paper proposes a real-world image super-resolution reconstruction method based on noise and a U-shape discrimination network. So that the degraded image has a feature distribution similar to that of the source image, and more detailed information and better visual quality can be restored, the complex noise collected directly from the real-world source image is injected into the synthetic degraded image. In addition, a U-shape discrimination network with spectral normalization is employed to increase the capability of the discriminator, stabilize training, and suppress artifacts in the reconstructed image. Experimental results on three benchmark datasets show that, compared with state-of-the-art methods, this model achieves the best results on all three evaluation metrics (peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and learned perceptual image patch similarity (LPIPS)) and has better visual quality.
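    Of the three metrics listed, PSNR is the simplest to state: it is the ratio of the squared peak value to the mean squared error, in decibels. A minimal sketch for single-channel 8-bit images stored as nested lists follows; the function name and the toy images are assumptions for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized greyscale images."""
    ref = [v for row in reference for v in row]
    res = [v for row in restored for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[100, 100], [100, 100]]
restored = [[110, 90], [110, 90]]  # every pixel off by 10, so MSE = 100
print(psnr(reference, restored))
```

    Higher values mean the restored image is closer to the reference; SSIM and LPIPS complement PSNR because pixel-wise error does not always track perceived quality.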
    Multi-Scale Attention Refinement Retinal Segmentation Algorithm
    LIANG Liming, CHEN Xin, YU Jie, ZHOU Longsong
    Computer Engineering and Applications    2023, 59 (6): 212-220.   DOI: 10.3778/j.issn.1002-8331.2111-0514
    Because retinal blood vessels are small and have low contrast, existing algorithms leave small vessels unsegmented and over-segment pathological areas. To address these problems, a multi-scale attention refinement retinal segmentation algorithm based on a U-shaped network is proposed. First of all, an improved dense convolution module is used in the encoding and decoding stages to fully extract the feature information of the blood vessels and improve the utilization of features. Secondly, the outputs of the four feature extraction stages of the encoding layers at different scales are concatenated and then passed to the decoding layer through skip connections. At the same time, a dual attention mechanism is introduced in the skip connections and the spatial refinement structure to spatially enhance the structure of the tiny blood vessels. Finally, the spatial refinement module is introduced in the decoder to further extract the spatial information of the tiny blood vessels and refine the distribution and shape of the blood vessels. The algorithm is verified on the public datasets DRIVE and STARE, where the accuracy (ACC) is 0.9649 and 0.9663, the sensitivity is 0.8422 and 0.8050, the specificity is 0.9822 and 0.9880, and the AUC is 0.9867 and 0.9895, respectively.
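    The accuracy, sensitivity and specificity figures reported above are all derived from the pixel-level confusion matrix. A minimal sketch, with a hypothetical function name and toy labels (1 = vessel, 0 = background):

```python
def vessel_metrics(pred, target):
    """Accuracy, sensitivity and specificity from flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return {
        "acc": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # share of vessel pixels that were found
        "specificity": tn / (tn + fp),   # share of background pixels kept clean
    }


pred = [1, 1, 0, 0, 0, 0, 1, 0]
target = [1, 0, 0, 0, 0, 0, 1, 1]
metrics = vessel_metrics(pred, target)
print(metrics)
```

    Because vessel pixels are a small minority, accuracy and specificity tend to be high even for weak models, so sensitivity is usually the most informative of the three for thin vessels.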
    Wearing Mask Pedestrian Tracking Based on Improved YOLOv7 and DeepSORT
    ZHAO Yuanlong, SHAN Yugang, YUAN Jie
    Computer Engineering and Applications    2023, 59 (6): 221-230.   DOI: 10.3778/j.issn.1002-8331.2210-0479
    A pedestrian tracking algorithm based on improved YOLOv7 and DeepSORT is proposed to solve the problem that face occlusion and missed detections in video sequences prevent a correct judgment of whether pedestrians are wearing masks. The algorithm combines mask detection, pedestrian detection and tracking. Firstly, an attention mechanism and shallow feature maps are added to the backbone network of YOLOv7 to enhance the ability of the network to perceive small targets and improve the accuracy of mask detection and pedestrian detection. Secondly, an intra-frame relationship module uses the Hungarian algorithm to associate the targets within each frame and mark the mask-wearing status of pedestrians. Then, a direction difference factor is added to the association cost of the DeepSORT algorithm to eliminate the inconsistency between the historical direction of a tracked trajectory and the velocity direction of a new detection. Finally, the improved DeepSORT algorithm is used to track pedestrians and update the mask-wearing mark of each track, achieving tracking of pedestrians both with and without masks. The experimental results show that the average detection accuracy mAP50 of the improved YOLOv7 network is 3.83 percentage points higher than that of the original algorithm. On the MOT16 dataset, the tracking accuracy MOTA of this algorithm is 17.1 percentage points higher than that of the DeepSORT algorithm, and the tracking precision MOTP is increased by 2.6 percentage points. Compared with the detection algorithm alone, this algorithm can track more pedestrians, whether or not they wear masks, and gives better results.
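    The intra-frame association step is an instance of the assignment problem. The sketch below solves the same problem by exhaustive search over permutations, which is only practical for a handful of boxes; it stands in for the Hungarian algorithm and is not the authors' implementation. The cost (one minus the fraction of the mask box lying inside the pedestrian box) and all names are assumptions made for illustration.

```python
from itertools import permutations


def containment(mask_box, person_box):
    """Fraction of the mask box area that lies inside the pedestrian box."""
    mx1, my1, mx2, my2 = mask_box
    px1, py1, px2, py2 = person_box
    iw = max(0.0, min(mx2, px2) - max(mx1, px1))
    ih = max(0.0, min(my2, py2) - max(my1, py1))
    return iw * ih / ((mx2 - mx1) * (my2 - my1))


def assign_masks(mask_boxes, person_boxes):
    """Minimum-cost one-to-one assignment; assumes len(mask_boxes) <= len(person_boxes)."""
    best, best_cost = (), float("inf")
    for perm in permutations(range(len(person_boxes)), len(mask_boxes)):
        cost = sum(1.0 - containment(mask_boxes[i], person_boxes[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(enumerate(best))  # pairs of (mask index, pedestrian index)


persons = [(0, 0, 10, 30), (20, 0, 30, 30)]
masks = [(22, 2, 28, 8), (2, 2, 8, 8)]
pairs = assign_masks(masks, persons)
print(pairs)
```

    In practice a polynomial-time solver would replace the exhaustive search, and pairs whose cost exceeds a threshold would be rejected so that an unmatched pedestrian is marked as not wearing a mask.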
    Research on Underwater Target Detection by Improved YOLOv3-SPP
    YE Zhaobing, DUAN Xianhua, ZHAO Chu
    Computer Engineering and Applications    2023, 59 (6): 231-240.   DOI: 10.3778/j.issn.1002-8331.2204-0264
    To solve the problem of false and missed detections caused by blurred images, complex backgrounds and small targets in underwater target detection, an improved YOLOv3-SPP underwater target detection algorithm is proposed. Firstly, the original underwater images are restored with the UWGAN network, and the Mixup method is employed to augment the data and reduce the memorization of mislabeled samples. Secondly, the YOLOv3-SPP network structure is used as the basis, and the number of prediction scales is increased to improve small target detection. Then the CIoU bounding box regression loss is introduced to improve the localization accuracy. Finally, the K-Means++ clustering algorithm is applied to select the best anchor boxes. The improved YOLOv3-SPP algorithm is evaluated on the processed URPC dataset, where the average detection accuracy is improved from 79.58% to 88.71% at a speed of 28.9 FPS. The results show that the improved algorithm has better overall detection capability than other algorithms.
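    Anchor selection with K-Means++ can be sketched as clustering the (width, height) pairs of the ground-truth boxes, using 1 - IoU as the distance so that large and small boxes are treated fairly. The box list, the mean-based centre update and all function names are assumptions for illustration; the abstract does not give the authors' exact procedure.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes given as (width, height) and aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_seeds(boxes, k, rng):
    """K-Means++ seeding: later seeds are drawn with probability proportional to distance squared."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - wh_iou(b, c) for c in centers) ** 2 for b in boxes]
        if sum(d2) == 0:  # every box coincides with an existing centre
            centers.append(rng.choice(boxes))
        else:
            centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def kmeans_anchors(boxes, k, iters=10, seed=0):
    rng = random.Random(seed)
    centers = kmeanspp_seeds(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for b in boxes:
            nearest = max(range(len(centers)), key=lambda i: wh_iou(b, centers[i]))
            clusters[nearest].append(b)
        # Move each centre to the mean width and height of its cluster (kept if the cluster is empty).
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else old
            for c, old in zip(clusters, centers)
        ]
    return sorted(centers)


# Hypothetical ground-truth box sizes in pixels: three small and three large objects.
boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58), (48, 62)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

    Compared with uniformly random initialization, the distance-weighted seeding makes it unlikely that two initial anchors land in the same size group.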