Most Read articles

    Published in last 1 year
    Research on Urban Logistics Distribution Mode of Bus-Assisted Drones
    PENG Yong, REN Zhi
    Computer Engineering and Applications    2024, 60 (7): 335-343.   DOI: 10.3778/j.issn.1002-8331.2212-0252
    The rapid development of e-commerce is forcing the logistics industry to keep transforming and upgrading. Because local governments encourage public transport and advocate green, low-carbon logistics, a bus-assisted drone distribution mode is studied. After the problem is described, a mathematical model that minimizes distribution cost is constructed, and a smart general variable neighborhood search metaheuristic is designed to solve it. To improve the efficiency of the algorithm, K-means clustering and a greedy algorithm are introduced to generate the initial solution. Firstly, on instances of different scales, a variety of local search strategies and algorithms are compared to verify the effectiveness of the algorithm. Secondly, using standard CVRP instances, the single-truck distribution mode and the truck-drone collaborative distribution mode are compared with the bus-assisted drone mode to demonstrate its cost and time advantages. Finally, Beijing Bus Rapid Transit Line 2 and its surrounding customer points are selected, and a sensitivity analysis is performed by changing the bus stop spacing and the departure interval. The results show that increasing the stop spacing has a greater impact than changing the departure interval.
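    The abstract mentions K-means clustering plus a greedy algorithm for building the initial solution. The sketch below is one plausible reading of that idea, not the authors' implementation: customers are clustered with plain Lloyd's k-means, and each cluster is ordered by a nearest-neighbor rule. The Euclidean coordinates, the `depot` point, the function names, and the choice of the first k points as seeds are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. The first k points seed the centroids
    (an assumption, chosen so the sketch is deterministic)."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return labels

def greedy_route(start, customers):
    """Nearest-neighbor ordering: always visit the closest unvisited customer."""
    remaining, route, current = list(customers), [], start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route

def initial_solution(depot, customers, k):
    """One greedy route per k-means cluster (hypothetical helper)."""
    labels = kmeans(customers, k)
    return [greedy_route(depot, [p for p, lab in zip(customers, labels) if lab == c])
            for c in range(k)]

customers = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
routes = initial_solution((-1, 0), customers, k=2)
```

    A metaheuristic such as variable neighborhood search would then start from `routes` rather than from a random solution.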
    Improved YOLOv8s Model for Small Object Detection from Perspective of Drones
    PAN Wei, WEI Chao, QIAN Chunyu, YANG Zhe
    Computer Engineering and Applications    2024, 60 (9): 142-150.   DOI: 10.3778/j.issn.1002-8331.2312-0043
    Object detection from the drone perspective yields imprecise results because image targets are small and densely distributed, classes are unevenly distributed, and hardware conditions limit model size. An improved model based on YOLOv8s with multiple attention mechanisms is proposed. To solve the problem of shared attention weight parameters in receptive field features and to enhance feature extraction, receptive field attention convolution and the CBAM (convolutional block attention module) attention mechanism are introduced into the backbone, adding attention weights in the channel and spatial dimensions. Large separable kernel attention is introduced into the feature pyramid pooling layers to increase information fusion between features at different levels. Feature layers rich in semantic information about small targets are added to improve the neck structure. The inner-IoU loss function is used to improve the MPDIoU (minimum point distance based IoU) function, and the resulting inner-MPDIoU replaces the original loss function to enhance learning on difficult samples. Experimental results show that the improved YOLOv8s model raises mAP, P, and R by 16.1%, 9.3%, and 14.9% respectively on the VisDrone dataset, surpasses YOLOv8m in performance, and can be effectively applied to unmanned aerial vehicle visual detection tasks.
    Improved YOLOv8 Object Detection Algorithm for Traffic Sign Target
    TIAN Peng, MAO Li
    Computer Engineering and Applications    2024, 60 (8): 202-212.   DOI: 10.3778/j.issn.1002-8331.2309-0415
    Although current detection technology is increasingly mature, detecting small targets in complex environments remains the most difficult research problem. To address the high proportion of small targets among traffic signs in road traffic scenarios and the strong environmental interference, a traffic sign detection algorithm based on an improved YOLOv8 is proposed. Because small targets are prone to missed detections, the bi-level routing attention (BRA) mechanism is used to improve the network's perception of small targets. In addition, the deformable convolution module DCNV3 (deformable convolution V3) is used. It extracts features of irregular shapes in the feature map more effectively, so that the backbone network adapts better to irregular spatial structures and attends more accurately to important targets, thereby improving the model's ability to detect occluded and overlapping targets. Both the DCNV3 and BRA modules improve the accuracy of the model without increasing the weight of the model. At the same time, the Inner-IoU loss function based on auxiliary bounding boxes is introduced. Small-sample training, large-sample training, single-target detection, and multi-target detection are performed on four datasets (RoadSign, CCTSDB, TSDD, and GTSDB), and the experimental results improve. The results on the RoadSign dataset are the best: mAP50 and mAP50:95 of the improved YOLOv8 model reach 90.7% and 75.1%, respectively, which is 5.9 and 4.8 percentage points higher than the baseline model. The experimental results show that the improved YOLOv8 model effectively detects traffic signs in complex road scenarios.
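    The auxiliary-box idea behind Inner-IoU can be illustrated with a few lines of arithmetic. The sketch below assumes boxes in `(cx, cy, w, h)` form and scales both boxes around their own centers by a `ratio` before computing IoU; the default ratio of 0.75 and the function name are illustrative assumptions, and this is not the paper's training code.

```python
def inner_iou(pred, target, ratio=0.75):
    """IoU of auxiliary boxes: each box keeps its center but has its
    width and height multiplied by `ratio` (boxes are (cx, cy, w, h))."""
    def aux(box):
        cx, cy, w, h = box
        return (cx - w * ratio / 2, cy - h * ratio / 2,
                cx + w * ratio / 2, cy + h * ratio / 2)

    px1, py1, px2, py2 = aux(pred)
    tx1, ty1, tx2, ty2 = aux(target)
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    # Areas of the scaled boxes are the original areas times ratio squared.
    union = (pred[2] * pred[3] + target[2] * target[3]) * ratio ** 2 - inter
    return inter / union

a, b = (0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0)
plain = inner_iou(a, b, ratio=1.0)    # ratio 1.0 reduces to ordinary IoU
smaller = inner_iou(a, b, ratio=0.5)  # smaller auxiliary boxes
larger = inner_iou(a, b, ratio=1.5)   # larger auxiliary boxes
```

    A loss of the form `1 - inner_iou(pred, target)` is the simplest way to use it; with a ratio below 1 the auxiliary boxes overlap less than the originals, and with a ratio above 1 they overlap more.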
    Small Sample Steel Plate Defect Detection Algorithm of Lightweight YOLOv8
    DOU Zhi, GAO Haoran, LIU Guoqi, CHANG Baofang
    Computer Engineering and Applications    2024, 60 (9): 90-100.   DOI: 10.3778/j.issn.1002-8331.2311-0070
    Steel plates have large surface areas, and surface defects are very common, falling into many classes with few samples each. Deep learning is therefore difficult to apply effectively to the detection of such small-sample defects. To solve this problem, a small-sample steel plate defect detection algorithm based on a lightweight YOLOv8 is proposed. Firstly, an interactive data augmentation algorithm based on fuzzy search is proposed, which addresses the problem that the network model cannot be trained effectively for lack of training samples and makes it possible to apply deep learning in this field. Then, the LMRNet (lightweight multi-scale residual network) is designed to replace the backbone of YOLOv8, making the network model lightweight and improving its portability. Finally, the CBFPN (context bidirectional feature pyramid network) and ECSA (efficient channel spatial attention) modules are proposed to make the network more effective at extracting and fusing scar features, and the Wise-IoU loss function is adopted to improve detection performance. Comparative experiments show that, relative to the original YOLOv8 algorithm, the improved network has only 30% of the parameters and 49% of the computation, and the FPS increases by 9 frames/s. The accuracy rate, recall rate, and mAP increase by 2.9, 6.5, and 5.5 percentage points respectively. The experimental results verify the advantages of the proposed algorithm.
    Improved YOLOv8 Multi-Scale and Lightweight Vehicle Object Detection Algorithm
    ZHANG Lifeng, TIAN Ying
    Computer Engineering and Applications    2024, 60 (3): 129-137.   DOI: 10.3778/j.issn.1002-8331.2309-0145
    To address issues such as high hardware requirements, low detection accuracy, and a high rate of missed overlapping targets in traditional vehicle object detection models, a modified vehicle object detection algorithm called RBT-YOLO based on YOLOv8 is proposed. The main network is reconstructed using a multi-scale fusion approach. BiFPN is improved by adding convolutional operations and adjusting input/output channel numbers to adapt to YOLOv8, enhancing its feature fusion capability. After the feature maps are output from the Neck section, a lightweight attention mechanism called Triplet Attention is introduced to enhance the feature extraction ability of the model. To address the issue of high target overlap in real scenarios, SoftNMS (soft non-maximum suppression) replaces the original NMS so that the model handles candidate boxes more gently, thereby strengthening its detection capability and improving recall. Experimental results on the Pascal VOC and MS COCO datasets demonstrate that the proposed RBT-YOLO outperforms the original model: parameters and computations are reduced by approximately 60%, and mAP improves by 2.6 and 3.0 percentage points. It also excels in both size and precision compared to other classic detection models, demonstrating strong practical utility.
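    To make the "gentler" handling of candidate boxes concrete, here is a minimal sketch of Gaussian Soft-NMS in plain Python. Instead of deleting boxes that overlap the current best box, it decays their scores by `exp(-IoU^2 / sigma)`. The box format `(x1, y1, x2, y2)`, the default `sigma`, the score threshold, and the function names are assumptions for illustration, not details taken from the RBT-YOLO paper.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: overlapping boxes are down-weighted, not removed."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        # Decay the scores of the rest according to their overlap with it.
        decayed = []
        for other, s in candidates:
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_thresh:
                decayed.append((other, s))
        candidates = decayed
    return kept

boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (100, 100, 110, 110)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

    In this example the second box overlaps the first heavily, so hard NMS with a typical 0.5 threshold would discard it; Soft-NMS keeps it with a reduced score, which is how recall on overlapping targets can improve.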
    Algorithm for Real-Time Vehicle Detection from UAVs Based on Optimizing and Improving YOLOv8
    SHI Tao, CUI Jie, LI Song
    Computer Engineering and Applications    2024, 60 (9): 79-89.   DOI: 10.3778/j.issn.1002-8331.2312-0291
    To address the low accuracy, susceptibility to background interference, and difficulty in detecting small target vehicles of existing UAV vehicle detection algorithms, an improved UAV vehicle detection algorithm, YOLOv8-CX, is proposed based on YOLOv8. By integrating the advantages of Deformable Convolutional Networks v1-3, a C2f-DCN module is proposed to sample features flexibly and better extract features of vehicles of different sizes. Drawing on the idea of large separable kernel attention, an SPPF-LSKA module with long-range dependency and self-adaptability is proposed, which effectively reduces background interference in vehicle detection. In the neck network, a CF-FPN (ment network for tiny object detection) feature fusion structure is adopted to enhance the detection accuracy of small targets by combining contextual information and suppressing conflicts between features at different scales. Finally, the original YOLOv8 head is replaced with a Dynamic Head detection head. By unifying three types of attention (scale, space, and task), the detection performance of the model is further improved. Experimental results show that on the Mapsai dataset, compared with the original algorithm, the improved algorithm increases the accuracy (P), recall (R), and mean average precision (mAP) by 8.5, 11.2, and 6.2 percentage points respectively, and its detection speed reaches 72.6 FPS, meeting the real-time requirements of UAV vehicle detection. Comparison with other mainstream target detection algorithms validates the effectiveness and superiority of this method.
    Survey on Video-Text Cross-Modal Retrieval
    CHEN Lei, XI Yimeng, LIU Libo
    Computer Engineering and Applications    2024, 60 (4): 1-20.   DOI: 10.3778/j.issn.1002-8331.2306-0382
    Modalities define the specific forms in which data exist. The swift expansion of various modal data types has brought multimodal learning into the limelight. As a crucial subset of this field, cross-modal retrieval has achieved noteworthy advancements, particularly in integrating images and text. However, videos, as opposed to images, encapsulate a richer array of modal data and offer a more extensive spectrum of information. This richness aligns well with the growing user demand for comprehensive and adaptable information retrieval solutions. Consequently, video-text cross-modal retrieval has emerged as a burgeoning area of research in recent times. To thoroughly comprehend video-text cross-modal retrieval and its state-of-the-art developments, a methodical review and summarization of the existing representative methods is conducted. Initially, the focus is on analyzing current deep learning-based unidirectional and bidirectional video-text cross-modal retrieval methods. This analysis includes an in-depth exploration of seminal works within each category, highlighting their strengths and weaknesses. Subsequently, the discussion shifts to an experimental viewpoint, introducing benchmark datasets and evaluation metrics specific to video-text cross-modal retrieval. The performance of several standard methods on benchmark datasets is compared. Finally, the application prospects and future research challenges of video-text cross-modal retrieval are discussed.
    Review of Deep Learning Methods Applied to Medical CT Super-Resolution
    TIAN Miaomiao, ZHI Lijia, ZHANG Shaomin, CHAO Daifu
    Computer Engineering and Applications    2024, 60 (3): 44-60.   DOI: 10.3778/j.issn.1002-8331.2303-0224
    Image super resolution (SR) is one of the important processing methods to improve image resolution in the field of computer vision, which has important research significance and application value in the field of medical image. High quality and high-resolution medical CT images are very important in the current clinical process. In recent years, the technology of medical CT image super-resolution reconstruction based on deep learning has made remarkable progress. This paper reviews the representative methods in this field and systematically reviews the development of medical CT image super-resolution reconstruction technology. Firstly, the basic theory of SR is introduced, and the commonly used evaluation indexes are given. Then, it focuses on the innovation and progress of super-resolution reconstruction of medical CT images based on deep learning, and makes a comprehensive comparative analysis of the main characteristics and performance of each method. Finally, the difficulties and challenges in the direction of medical CT image super-resolution reconstruction are discussed, and the future development trend is summarized and prospected, hoping to provide reference for related research.
    Improved YOLOv8 Small Target Detection Algorithm in Aerial Images
    FU Jinyi, ZHANG Zijia, SUN Wei, ZOU Kaixin
    Computer Engineering and Applications    2024, 60 (6): 100-109.   DOI: 10.3778/j.issn.1002-8331.2311-0281
    In aerial image detection tasks, objects are small relative to the overall image, scales vary, and detail information is unclear, which leads to missed and false detections. To address this, an improved small target detection algorithm, CA-YOLOv8, is proposed. Channel feature partial convolution (CFPConv) is designed and used to reconstruct the Bottleneck structure in C2f; the result is named CFP_C2f. It replaces some C2f modules in the YOLOv8 head and neck, which enhances the effective channel feature weights and improves the ability to obtain multi-scale detail features. A context aggregated module (CAM) is embedded to improve context aggregation, optimize the response of feature channels, and strengthen the perception of details in deep features. The NWD loss function is added and combined with CIoU as the localization regression loss to reduce sensitivity to position deviation. To make full use of multiple attention mechanisms, the original detection head is replaced with DyHead (dynamic head). In experiments on the VisDrone2019 dataset, the improved algorithm reduces the number of parameters by 33.3% compared with the original YOLOv8s model, and mAP50 and mAP50:95 increase by 8.7 and 5.7 percentage points respectively, showing good performance and confirming its effectiveness.
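    The reason NWD helps with tiny objects is easiest to see numerically: two small boxes that do not overlap have IoU 0 regardless of how close they are, while the normalized Wasserstein distance still varies smoothly with their offset. The sketch below models each `(cx, cy, w, h)` box as a 2D Gaussian, as in the NWD formulation, and mixes the result with a plain IoU term. Note the assumptions: plain IoU stands in for CIoU to keep the example short, the weighted sum with `alpha` is a hypothetical combination rule, and the constant `c` (dataset dependent) is a placeholder value.

```python
import math

def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes,
    each treated as a Gaussian with mean (cx, cy) and std (w/2, h/2)."""
    w2 = math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                   + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-w2 / c)

def iou(a, b):
    """Plain IoU for (cx, cy, w, h) boxes (stand-in for CIoU in this sketch)."""
    iw = max(0.0, min(a[0] + a[2] / 2, b[0] + b[2] / 2) - max(a[0] - a[2] / 2, b[0] - b[2] / 2))
    ih = max(0.0, min(a[1] + a[3] / 2, b[1] + b[3] / 2) - max(a[1] - a[3] / 2, b[1] - b[3] / 2))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def combined_loss(a, b, alpha=0.5, c=12.8):
    """Hypothetical weighted sum of an NWD term and an IoU term."""
    return alpha * (1.0 - nwd(a, b, c)) + (1.0 - alpha) * (1.0 - iou(a, b))

pred, target = (10.0, 10.0, 4.0, 4.0), (13.0, 14.0, 4.0, 4.0)
similarity = nwd(pred, target, c=5.0)          # still informative
overlap = iou(pred, target)                    # zero: the boxes do not overlap
loss = combined_loss(pred, target, alpha=0.5, c=5.0)
```

    Here the two 4×4 boxes are offset by 5 pixels and share no area, so the IoU term alone gives no gradient signal about distance, while the NWD term does.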
    Review on Human Action Recognition Methods Based on Multimodal Data
    WANG Cailing, YAN Jingjing, ZHANG Zhidong
    Computer Engineering and Applications    2024, 60 (9): 1-18.   DOI: 10.3778/j.issn.1002-8331.2310-0090
    Human action recognition (HAR) is widely applied in the fields of intelligent security, autonomous driving and human-computer interaction. With advances in capture equipment and sensor technology, the data that can be acquired for HAR is no longer limited to RGB data, but also includes multimodal data such as depth, skeleton, and infrared data. Feature extraction methods in HAR based on RGB and skeleton data modalities are introduced in detail, including handcrafted and deep learning-based methods. For RGB data modalities, feature extraction algorithms based on two-stream convolutional neural network (2s-CNN), 3D convolutional neural network (3DCNN) and hybrid network are analyzed. For skeleton data modalities, some popular single-person and multi-person pose estimation algorithms are introduced first. The classification algorithms based on convolutional neural network (CNN), recurrent neural network (RNN), and graph convolutional neural network (GCN) are then analyzed in depth. The common datasets for both data modalities are also presented comprehensively. In addition, the current challenges are explored based on the corresponding data structure features of RGB and skeleton. Finally, future research directions for deep learning-based HAR methods are discussed.
    Baggage Tracking Technology Based on Improved YOLO v8
    CAO Chao, GU Xingsheng
    Computer Engineering and Applications    2024, 60 (9): 151-158.   DOI: 10.3778/j.issn.1002-8331.2310-0238
    In the airport baggage sorting scenario, traditional multi-target tracking algorithms suffer from a high target ID switching rate and a high false alarm rate for target trajectories. This paper presents a baggage tracking technique based on improved YOLO v8 and ByteTrack algorithms. A CBATM module is added, the detection head is replaced with the ADH decoupled head, and the training loss function is changed, which increases detection accuracy, strengthens the discrimination of target features, and reduces the target ID switching rate. GSI interpolation is added to the Byte data association, which not only uses both high-score and low-score boxes but also maintains tracking after long occlusions and reduces the ID switches caused by occlusion. On the airport baggage sorting dataset, MOTA and IDF1 reach 89.9% and 90.3%, respectively, a significant improvement that enables stable tracking of luggage IDs.
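    For readers unfamiliar with the two tracking metrics quoted above, the following sketch computes them from their standard definitions: MOTA penalizes misses, false positives, and ID switches relative to the number of ground-truth objects, while IDF1 is the F1 score over identity-level matches. The counts in the example are made up for illustration and are not the paper's data.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

def idf1(idtp, idfp, idfn):
    """Identification F1: 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)

# Hypothetical counts accumulated over a whole sequence.
example_mota = mota(false_negatives=60, false_positives=30, id_switches=10, num_gt=1000)
example_idf1 = idf1(idtp=900, idfp=100, idfn=100)
```

    Because ID switches appear directly in the MOTA numerator and indirectly in IDF1 through identity mismatches, reducing occlusion-induced ID switches raises both scores.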
    Survey of Few-Shot Image Classification Based on Deep Meta-Learning
    ZHOU Bojun, CHEN Zhiyu
    Computer Engineering and Applications    2024, 60 (8): 1-15.   DOI: 10.3778/j.issn.1002-8331.2308-0271
    Deep meta-learning has emerged as a popular paradigm for addressing few-shot classification problems. A comprehensive review of recent advancements in few-shot image classification algorithms based on deep meta-learning is provided. Starting from the problem description, the categories of deep meta-learning algorithms for few-shot image classification are summarized, and commonly used few-shot image classification datasets and evaluation criteria are introduced. Subsequently, typical models and the latest research progress are elaborated in three aspects: model-based deep meta-learning methods, optimization-based deep meta-learning methods, and metric-based deep meta-learning methods. Finally, the performance analysis of existing algorithms on popular public datasets is presented, the research hotspots in this topic are summarized, and its future research directions are discussed.
    Vehicle Detection Algorithm Based on Improved YOLOv8 in Traffic Surveillance
    ZHOU Fei, GUO Dudu, WANG Yang, WANG Qingqing, QIN Yin, YANG Zhuomin, HE Haijun
    Computer Engineering and Applications    2024, 60 (6): 110-120.   DOI: 10.3778/j.issn.1002-8331.2310-0101
    To address the current problems of insufficient vehicle detection accuracy and slow detection speed in complex traffic monitoring scenarios, a lightweight vehicle detection algorithm based on the YOLOv8 model is proposed. Firstly, FasterNet is used to replace the backbone feature extraction network of YOLOv8, which reduces redundant computation and memory access and improves the detection accuracy and inference speed of the model. Secondly, the SimAM attention module is added to the Backbone and Neck sections, which enhances the important features of the target vehicles without increasing the original network parameters and improves the feature fusion capability. Then, to address the poor detection of small vehicles under dense traffic flow, a small target detection head is added to better capture the features and contextual information of small vehicles. Finally, Wise-IoU, which can adaptively adjust the weight coefficients, is used as the loss function of the improved model, which enhances the regression performance of the bounding box and the robustness of the detection. The experimental results on the UA-DETRAC dataset show that, compared with the original model, the improved method achieves better detection accuracy and speed in the traffic monitoring system, with mAP and FPS improved by 3.06 percentage points and 3.36%, respectively. It effectively alleviates the poor detection of small target vehicles in complex traffic scenarios and achieves a good balance between accuracy and speed.
    Review of Development of Deep Learning Optimizer
    CHANG Xilong, LIANG Kun, LI Wentao
    Computer Engineering and Applications    2024, 60 (7): 1-12.   DOI: 10.3778/j.issn.1002-8331.2307-0370
    Optimization algorithms, which work by minimizing the loss function, are the most critical factor in improving the performance of deep learning models. As large language models (LLMs) such as GPT have become the research focus in natural language processing, the effectiveness of the traditional gradient descent algorithm has become limited. Adaptive moment estimation algorithms have therefore emerged, which are significantly superior to traditional optimization algorithms in generalization ability. The pros and cons of optimization algorithms are analyzed across three families: gradient descent, adaptive gradient, and adaptive moment estimation. This paper applies optimization algorithms to the Transformer architecture and selects the French-English translation task as the evaluation benchmark. Experiments show that adaptive moment estimation algorithms can effectively improve the performance of the model in machine translation tasks. The development direction and applications of optimization algorithms are also discussed.
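    The difference between plain gradient descent and adaptive moment estimation (Adam) can be seen in a dozen lines. The sketch below applies both update rules to the toy objective f(x) = x², using the standard Adam update with bias-corrected first and second moment estimates. The toy objective, learning rate, step count, and function names are illustrative assumptions and have nothing to do with the paper's Transformer experiments.

```python
import math

def sgd_minimize(grad, x, lr=0.1, steps=200):
    """Plain gradient descent: x <- x - lr * g."""
    history = []
    for _ in range(steps):
        x -= lr * grad(x)
        history.append(x)
    return history

def adam_minimize(grad, x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    """Adam: step size is scaled by running estimates of the gradient's
    first moment (m) and second moment (v), both bias-corrected."""
    m = v = 0.0
    history = []
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for m
        v_hat = v / (1 - beta2 ** t)   # bias correction for v
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        history.append(x)
    return history

grad = lambda x: 2.0 * x               # gradient of f(x) = x**2
sgd_path = sgd_minimize(grad, 1.0)
adam_path = adam_minimize(grad, 1.0)
```

    On the first step Adam moves by almost exactly `lr` regardless of the gradient's magnitude, because `m_hat / sqrt(v_hat)` is close to the sign of the gradient; this scale invariance is one reason adaptive methods are less sensitive to poorly scaled gradients than plain gradient descent.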
    Review of Research on Artificial Intelligence in Traditional Chinese Medicine Diagnosis and Treatment
    SU Youli, HU Xuanyu, MA Shijie, ZHANG Yuning, Abudukelimu Abulizi, Halidanmu Abudukelimu
    Computer Engineering and Applications    2024, 60 (16): 1-18.   DOI: 10.3778/j.issn.1002-8331.2312-0400
    The field of traditional Chinese medicine (TCM) diagnosis and treatment is gradually moving towards standardization, objectification, modernization, and intelligence. In this process, the integration of artificial intelligence (AI) has greatly propelled the advancement of TCM diagnosis and treatment, scientific research, and TCM inheritance. The review starts from the current research status of AI in TCM, combs through the application and development of AI in TCM in three stages from expert system and rule engines, traditional machine learning algorithm to deep learning, and then summarizes the knowledge management tools and large language models of TCM in recent years. Finally, this paper analyzes the multiple challenges of data fairness, multimodal data understanding, model robustness, personalized medicine, and interpretability that exist at this stage of AI in TCM. To address these challenges, it is necessary to continuously explore and propose possible solutions to promote the in-depth development of intelligent TCM diagnosis and treatment, thus better meeting the health needs of people.
    Improved YOLOv8 Lightweight UAV Target Detection Algorithm
    HU Junfeng, LI Baicong, ZHU Hao, HUANG Xiaowen
    Computer Engineering and Applications    2024, 60 (8): 182-191.   DOI: 10.3778/j.issn.1002-8331.2310-0063
    UAV target detection algorithms are computationally complex and difficult to deploy, and the long-tailed distribution of UAV data leads to low detection accuracy. To address these problems, a lightweight UAV target detection algorithm based on an improved YOLOv8 (PC-YOLOv8-n) is proposed, which balances detection accuracy against computation and generalizes to some extent to long-tailed data. Partial convolutional layers (PConv) replace the 3×3 convolutional layers in YOLOv8, making the network lightweight and reducing network redundancy and computational complexity. A dual-channel feature pyramid is fused in, adding top-down paths that fuse deep and shallow information, and a lightweight attention mechanism is introduced within the same layer to improve the feature extraction ability of the network. The equalized focal loss (EFL) is used as the category loss function to improve category detection by equalizing the gradient weights of the tail categories during training. The experimental results show that PC-YOLOv8-n performs well on the VisDrone2019 dataset, improving mAP50 by 1.6 percentage points over the original YOLOv8-n algorithm, while the parameters and computation of the model are reduced to 2.6×10^6 and 7.6 GFLOPs, respectively, and the detection speed reaches 77.2 FPS.
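    Why partial convolution makes a network lighter comes down to simple arithmetic: PConv applies a regular convolution to only a fraction of the channels and passes the rest through untouched. The sketch below compares operation counts for a regular 3×3 convolution and a PConv layer. It assumes equal input and output channel counts, stride 1, one multiply-accumulate counted per weight application, and a channel ratio of 1/4; the ratio, feature-map size, and function names are assumptions for illustration rather than values from the paper.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates of a k x k convolution producing an
    h x w output map (stride 1, 'same' padding assumed)."""
    return h * w * k * k * c_in * c_out

def pconv_macs(h, w, k, c, ratio=0.25):
    """Partial convolution: only c_p = ratio * c channels are convolved;
    the remaining channels are copied through at no arithmetic cost."""
    c_p = int(c * ratio)
    return conv_macs(h, w, k, c_p, c_p)

regular = conv_macs(80, 80, 3, 64, 64)
partial = pconv_macs(80, 80, 3, 64, ratio=0.25)
saving = regular / partial
```

    With a quarter of the channels convolved, the cost drops by a factor of sixteen, since the count scales with the square of the number of convolved channels.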
    Review of Unsupervised Domain Adaptation in Medical Image Segmentation
    HU Wei, XU Qiaozhi, GE Xiangwei, YU Lei
    Computer Engineering and Applications    2024, 60 (6): 10-26.   DOI: 10.3778/j.issn.1002-8331.2307-0421
    Medical image segmentation has broad application prospects in the field of medical image processing, providing auxiliary information for diagnosis and treatment by locating and segmenting organs, tissues, or lesion areas of interest. However, there is a domain shift problem between different modalities of medical images, which can lead to a significant decrease in the performance of the segmentation model during actual deployment. Domain adaptation technology is an effective way to solve this problem, especially unsupervised domain adaptation, which has become a research hotspot in the field of medical image processing because it does not require target domain label information. At present, there are relatively few reviews of unsupervised domain adaptation research in medical image segmentation. Therefore, this paper summarizes and analyzes recent unsupervised domain adaptation research in medical image segmentation and discusses its future prospects, hoping to help researchers quickly become familiar with the current research status and trends in this field.
    Research Progress on Designing Lightweight Deep Convolutional Neural Networks
    ZHOU Zhifei, LI Hua, FENG Yixiong, LU Jianguang, QIAN Songrong, LI Shaobo
    Computer Engineering and Applications    2024, 60 (22): 1-17.   DOI: 10.3778/j.issn.1002-8331.2404-0372
    Lightweight design is a popular paradigm to address the dependence of deep convolutional neural network (DCNN) on device performance and hardware resources, and the purpose of lightweighting is to increase the computational speed and reduce the memory footprint without sacrificing the network performance. An overview of lightweight design approaches for DCNNs is presented, focusing on a review of the research progress in recent years, including two major lightweighting strategies, namely, system design and model compression, as well as an in-depth comparison of the innovativeness, strengths and limitations of these two types of approaches, and an exploration of the underlying framework that supports the lightweighting model. In addition, scenarios in which lightweight networks have been successfully applied are described, and predictions are made for the future development trend of DCNN lightweighting, aiming to provide useful insights and references for the research on lightweight deep convolutional neural networks.
    Small Object Detection Algorithm Based on ATO-YOLO
    SU Jia, QIN Yichang, JIA Ze, WANG Jing
    Computer Engineering and Applications    2024, 60 (6): 68-77.   DOI: 10.3778/j.issn.1002-8331.2308-0385
    Small object detection is of great significance in the field of computer vision. However, existing methods often suffer from issues such as missed detection and false alarms when dealing with challenges like scale variation, dense object arrangement, and irregular layouts. To address these problems, ATO-YOLO, an improved version of the YOLOv5 algorithm, is proposed. Firstly, this paper introduces an adaptive feature extraction (AFE) module that incorporates an attention mechanism to enhance the feature representation capability of the detection model. By dynamically adjusting the weight allocation to highlight key object features, AFE improves the accuracy and robustness of object detection tasks in various scenarios. Secondly, a triple feature fusion (TFF) mechanism is designed to effectively utilize multi-scale information by fusing feature maps from different scales, resulting in more comprehensive object features and enhanced detection performance for small objects. Lastly, an output reconstruction (ORS) module is introduced, which removes the large object detection layer and adds a small object detection layer, enabling precise localization and recognition of small objects. This module also reduces model complexity and improves detection speed compared to the original model. Experimental results demonstrate that the ATO-YOLO algorithm achieves an mAP@0.5 of 38.2% on the VisDrone dataset, a 6.1 percentage point improvement over YOLOv5, with a relative FPS increase of 4.4%. This algorithm enables fast and accurate detection of small objects.
    Research on Gesture Recognition Based on Improved YOLOv5 and Mediapipe
    NI Guangxing, XU Hua, WANG Chao
    Computer Engineering and Applications    2024, 60 (7): 108-118.   DOI: 10.3778/j.issn.1002-8331.2308-0097
    Existing gesture recognition algorithms suffer from heavy computation and poor robustness. In this paper, a gesture recognition method based on the IYOLOv5-Med (improved YOLOv5 Mediapipe) algorithm is proposed. The algorithm combines an improved YOLOv5 algorithm with the Mediapipe method and consists of two parts, gesture detection and gesture analysis. For gesture detection, the traditional YOLOv5 algorithm is improved. Firstly, the C3 module is reconstructed by FastNet. Secondly, the CBS module is replaced by the GhostConv module in GhostNet. Thirdly, the SE attention mechanism module is introduced at the end of the Backbone network. The improved algorithm has a smaller model size and is more suitable for edge devices with limited resources. For gesture analysis, a method based on Mediapipe is proposed: the key points of the hand are detected in the gesture area located by the gesture detection part, the relevant features are extracted, and the gesture is then identified by a naive Bayes classifier. The experimental results confirm the effectiveness of the IYOLOv5-Med algorithm. Compared with the conventional YOLOv5 algorithm, the parameters are reduced by 34.5%, the computations are reduced by 34.9%, and the model weight is decreased by 33.2%. The final average recognition rate reaches 0.997, and the method is relatively simple to implement, giving it good application prospects.
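The last stage described above, classifying features derived from hand key points with a naive Bayes classifier, can be illustrated with a minimal Gaussian naive Bayes sketch. The abstract does not say which features or which naive Bayes variant the authors use, so everything here is an assumption made for illustration: the five "finger extension" scores, the labels, the Gaussian likelihood, and the `smoothing` constant are all hypothetical.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes; features are treated as independent."""

    def fit(self, rows, labels, smoothing=1e-3):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            n = len(members)
            columns = list(zip(*members))
            means = [sum(col) / n for col in columns]
            # The smoothing term keeps the variance strictly positive.
            variances = [
                sum((v - m) ** 2 for v in col) / n + smoothing
                for col, m in zip(columns, means)
            ]
            log_prior = math.log(n / len(rows))
            self.stats[label] = (log_prior, means, variances)
        return self

    def predict(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            log_likelihood = sum(
                -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
                for x, m, var in zip(row, means, variances)
            )
            return log_prior + log_likelihood

        return max(self.stats, key=log_posterior)


# Hypothetical per-finger extension scores in [0, 1] (not from the paper).
train_rows = [
    [0.1, 0.2, 0.1, 0.2, 0.2],
    [0.2, 0.1, 0.2, 0.1, 0.1],
    [0.9, 0.8, 0.9, 0.9, 0.8],
    [0.8, 0.9, 0.8, 0.8, 0.9],
]
train_labels = ["fist", "fist", "open", "open"]

model = GaussianNaiveBayes().fit(train_rows, train_labels)
print(model.predict([0.15, 0.15, 0.1, 0.2, 0.1]))
print(model.predict([0.85, 0.9, 0.8, 0.85, 0.9]))
```

In a real pipeline the feature rows would come from the detected hand key points rather than hand-written numbers.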
    Reference | Related Articles | Metrics
    Research Progress on Recommendation Algorithms with Knowledge Graph Visualization Analysis
    LIN Suqing, LUO Dingnan, ZHANG Shuhua
    Computer Engineering and Applications    2024, 60 (21): 1-17.   DOI: 10.3778/j.issn.1002-8331.2312-0032
    Abstract254)      PDF(pc) (1215KB)(313)       Save
    The application and proliferation of internet technology has caused an exponential growth in data, increasing the complexity of information retrieval from massive datasets. Recommendation algorithms have attracted significant attention for alleviating information overload, with relevant research findings continually emerging. A total of 4,773 Chinese and 4,531 English publications from 2012 to 2024 have been sourced from China National Knowledge Infrastructure (CNKI) and the Web of Science (WOS) core collection. The visualization tools CiteSpace and VOSviewer have been used to generate basic information and keyword co-occurrence graphs for the literature. Core technology keywords, including knowledge graph, graph neural network, and deep learning, have been extracted through graph analysis, and the corresponding representative recommendation algorithms have been selected. The core mechanisms and underlying principles of the algorithms have been visually presented through charts, focusing on the limitations and challenges of existing research, as well as targeted solutions. Knowledge architecture diagrams have been developed for the algorithms associated with each core technology keyword, following the challenge-solution-source literature framework. The visualization of recommendation principles has been effectively implemented.
    Reference | Related Articles | Metrics
    Survey on Attack Methods and Defense Mechanisms in Federated Learning
    ZHANG Shiwen, CHEN Shuang, LIANG Wei, LI Renfa
    Computer Engineering and Applications    2024, 60 (5): 1-16.   DOI: 10.3778/j.issn.1002-8331.2306-0243
    Abstract252)      PDF(pc) (792KB)(271)       Save
    Attack and defense techniques are the core issue in the security of federated learning systems. Research on these techniques can significantly reduce the risk of attack and greatly enhance the security of federated learning systems, and a deep understanding of them can advance research in the field and promote the widespread application of federated learning. Therefore, it is of great significance to study the attack and defense techniques of federated learning. Firstly, this paper briefly introduces the concept, basic workflow, types, and potential security issues of federated learning. Subsequently, it introduces the attacks that a federated learning system may encounter and summarizes the relevant research. Then, according to whether the federated learning system has targeted defense measures, the defense measures are divided into two categories, universal defense measures and targeted defense measures, and each category is summarized. Finally, it reviews and analyzes future research directions for the security of federated learning, providing a reference for researchers working on federated learning security.
    Reference | Related Articles | Metrics
    Review of Application of Visual Foundation Model SAM in Medical Image Segmentation
    SUN Xing, CAI Xiaohong, LI Ming, ZHANG Shuai, MA Jingang
    Computer Engineering and Applications    2024, 60 (17): 1-16.   DOI: 10.3778/j.issn.1002-8331.2401-0136
    Abstract251)      PDF(pc) (7912KB)(238)       Save
    With the continuous development of foundation model technology, visual foundation models represented by the segment anything model (SAM) have made significant breakthroughs in the field of image segmentation. SAM, driven by prompts, accomplishes a series of downstream segmentation tasks, aiming to address all image segmentation issues comprehensively. Therefore, the application of SAM in medical image segmentation is of great significance, as its generalization performance can adapt to various medical images, providing healthcare professionals with a more comprehensive understanding of anatomical structures and pathological information. This paper introduces commonly used datasets for image segmentation and provides detailed explanations of SAM’s network architecture and generalization capabilities. It focuses on a thorough analysis of SAM’s application in five major categories of medical images: whole-slide imaging, magnetic resonance imaging, computed tomography, ultrasound, and multimodal images. The review summarizes the strengths and weaknesses of SAM, along with corresponding improvement methods. Considering current challenges in the field of medical image segmentation, the paper discusses and anticipates future directions for SAM’s development.
    Reference | Related Articles | Metrics
    Lightweight Foggy Weather Object Detection Method Based on YOLOv5
    LAI Jing’an, CHEN Ziqiang, SUN Zongwei, PEI Qingqi
    Computer Engineering and Applications    2024, 60 (6): 78-88.   DOI: 10.3778/j.issn.1002-8331.2308-0029
    Abstract248)      PDF(pc) (1220KB)(236)       Save
    To address the low accuracy and high model complexity of object detection algorithms in foggy scenes, a lightweight foggy-weather object detection method based on YOLOv5 is proposed. Firstly, this paper adopts the receptive field attention module (RFAblock), which adds an attention mechanism to the receptive field by interacting with the receptive field feature information, to improve the feature extraction ability. Secondly, the lightweight network Slimneck is used as the neck structure to reduce the model parameters and complexity while maintaining accuracy. The angle vector between the ground-truth box and the predicted box is introduced into the loss function to improve the training speed and inference accuracy. PNMS (precise non-maximum suppression) is used to improve the candidate box selection mechanism and reduce the missed detection rate in the case of vehicle occlusion. Finally, experiments are conducted on the real foggy dataset RTTS and the synthetic foggy dataset Foggy Cityscapes. The results show that the mAP50 is improved by 4.9 and 3.5 percentage points, respectively, compared with YOLOv5l, and the number of model parameters is only 54.6% of that of YOLOv5l.
    Reference | Related Articles | Metrics
    Algorithmic Research Overview on Graph Coloring Problems
    SONG Jiahuan, WANG Xiaofeng, HU Simin, JIA Jingwei, YAN Dong
    Computer Engineering and Applications    2024, 60 (18): 66-77.   DOI: 10.3778/j.issn.1002-8331.2403-0434
    Abstract241)      PDF(pc) (4612KB)(192)       Save
    The graph coloring problem (GCP) is a classic combinatorial optimization problem that has been widely applied in various fields such as mathematics, computer science, and biological science. Because the graph coloring problem is NP-hard, no exact polynomial-time algorithm for it is currently known. To support the design of efficient algorithms for this problem, it is necessary to review the existing ones. Existing algorithms are mainly divided into intelligent optimization algorithms, heuristic algorithms, reinforcement learning algorithms, and others. A comparative analysis is carried out in terms of algorithm principles, improvement ideas, performance, and accuracy; the advantages and disadvantages of the algorithms are summarized; and the research directions and algorithm design paths of GCP are pointed out, which provides guidance for research on related problems.
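As a concrete illustration of the heuristic family mentioned above, here is a minimal greedy coloring sketch. It is a textbook heuristic chosen for illustration, not an algorithm taken from the survey: vertices are visited in order of decreasing degree (an assumed ordering rule) and each receives the smallest color not already used by a colored neighbor. The result is always a proper coloring but is not guaranteed to use the minimum number of colors.

```python
def greedy_coloring(adjacency):
    """Color vertices greedily; adjacency maps each vertex to its neighbors."""
    # Assumed ordering heuristic: highest-degree vertices first.
    order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    colors = {}
    for vertex in order:
        used = {colors[u] for u in adjacency[vertex] if u in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


def is_proper(adjacency, colors):
    """Check that no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])


# A 5-cycle: an odd cycle needs three colors.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle5)
print(coloring, is_proper(cycle5, coloring))
```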
    Reference | Related Articles | Metrics
    Survey of Vision Transformer in Low-Level Computer Vision
    ZHU Kai, LI Li, ZHANG Tong, JIANG Sheng, BIE Yiming
    Computer Engineering and Applications    2024, 60 (4): 39-56.   DOI: 10.3778/j.issn.1002-8331.2304-0139
    Abstract239)      PDF(pc) (3488KB)(179)       Save
    Transformer is a revolutionary neural network architecture initially designed for natural language processing. However, its outstanding performance and versatility have led to widespread applications in the field of computer vision. While there is a wealth of research and literature on Transformer applications in natural language processing, there remains a relative scarcity of specialized reviews focusing on low-level visual tasks. In light of this, this paper begins by providing a brief introduction to the principles of Transformer and analyzing several variants. Subsequently, the focus shifts to the application of Transformer in low-level visual tasks, specifically in the key areas of image restoration, image enhancement, and image generation. Through a detailed analysis of the performance of different models in these tasks, this paper explores the variations in their effectiveness on commonly used datasets. This includes achievements in restoring damaged images, improving image quality, and generating realistic images. Finally, this paper summarizes and forecasts the development trends of Transformer in the field of low-level visual tasks. It suggests directions for future research to further drive innovation and advancement in Transformer applications. The rapid progress in this field promises breakthroughs for computer vision and image processing, providing more powerful and efficient solutions for practical applications.
    Reference | Related Articles | Metrics
    Review of Deep Learning Methods for Palm Vein Recognition
    TAN Zhenlin, LIU Ziliang, HUANG Aiquan, CHEN Huihui, ZHONG Yong
    Computer Engineering and Applications    2024, 60 (6): 55-67.   DOI: 10.3778/j.issn.1002-8331.2306-0168
    Abstract236)      PDF(pc) (664KB)(135)       Save
    Palm vein recognition, as a new infrared biometrics technology, has become one of the research hotspots in the field of biometric recognition because of its advantages of high security and liveness detection. In recent years, a great deal of research in this field has promoted the development of palm vein recognition technology by introducing deep learning methods. In order to grasp the latest research status and development direction in the field of palm vein recognition, data acquisition and the mainstream algorithms of data pre-processing are classified and summarized, and the latest progress of palm vein recognition based on deep learning is classified and elaborated in terms of palm vein feature representation, network design and optimization, and lightweight networks. In view of the bottleneck of single-modal recognition, the related algorithms for multimodal and multi-feature fusion recognition are analyzed and compared. The difficulties and challenges of current research on palm vein recognition are discussed, and future development trends are outlined.
    Reference | Related Articles | Metrics
    Review of Deep Learning Approaches for Recognizing Multiple Unsafe Behaviors in Workers
    SU Chenyang, WU Wenhong, NIU Hengmao, SHI Bao, HAO Xu, WANG Jiamin, GAO Le, WANG Weitai
    Computer Engineering and Applications    2024, 60 (5): 30-46.   DOI: 10.3778/j.issn.1002-8331.2307-0168
    Abstract224)      PDF(pc) (808KB)(214)       Save
    With the development of deep learning, target detection and behavior recognition methods have made great progress in the field of worker unsafe behavior recognition. This paper systematically summarizes the relevant research work at home and abroad in recent years, elaborates the commonly used models and effects of target detection methods and behavior recognition methods, and focuses on reviewing the application of the two types of methods in the recognition of unsafe behaviors and the relevant research on their combination. The advantages, limitations, recognized behavior categories, and applicable scenarios of the various methods are comprehensively analyzed and compared. On this basis, the optimization measures for target detection and behavior recognition in recent years, including the commonly used optimization directions and means and the improvement methods successfully applied in unsafe behavior recognition, are summarized. The difficulties and problems in this research field are sorted out, and suggestions and future development trends are given, providing a reference for research in this field.
    Reference | Related Articles | Metrics
    Research Progress of Image Style Transfer Based on Neural Network
    LIAN Lu, TIAN Qichuan, TAN Run, ZHANG Xiaohang
    Computer Engineering and Applications    2024, 60 (9): 30-47.   DOI: 10.3778/j.issn.1002-8331.2309-0204
    Abstract217)      PDF(pc) (7029KB)(254)       Save
    Image style transfer is the process of remapping the content of a specified image with a style image, and it is a research hotspot in artificial intelligence and computer vision. Traditional image style transfer methods are mainly based on the synthesis of physical and texture techniques, and the style transfer effect is rough and less robust. With the emergence of image datasets and the proposal of various deep learning models, many models and algorithms for image style transfer have emerged. This paper analyzes the current status of image style transfer research, traces the development of image style transfer and the latest research progress, and gives future research directions for image style transfer through comparative analysis.
    Reference | Related Articles | Metrics
    Research on Steel Surface Defect Detection with Improved YOLOv7 Algorithm
    GAO Chunyan, QIN Shen, LI Manhong, LYV Xiaoling
    Computer Engineering and Applications    2024, 60 (7): 282-291.   DOI: 10.3778/j.issn.1002-8331.2308-0414
    Abstract210)      PDF(pc) (1101KB)(176)       Save
    At present, intelligent inspection technology based on deep learning is gradually being applied to steel surface defect detection. To address the low precision of steel surface defect detection, a high-precision, real-time defect detection algorithm, CDN-YOLOv7, is proposed. Firstly, the CARAFE lightweight up-sampling operator is added to improve the feature fusion capability of the network. Then, the YOLOv7 detection head network is redesigned by integrating the cascade attention mechanism and decoupled heads, aiming to solve the problem of low feature utilization efficiency in the original head network, make full use of multi-dimensional information across scales, channels, and spatial positions, and improve the representation ability of the model in complex scenarios. Finally, the normalized Wasserstein distance is introduced to redesign the Focal-EIoU loss function, and NF-EIoU is proposed to replace the CIoU loss, balancing the contribution of defect samples at different scales to the loss and reducing the missed detection rate of defects at different scales. The experimental results show that the detection accuracy of CDN-YOLOv7 reaches 80.3%, which is 6.0 percentage points higher than that of the original YOLOv7, and the model inference speed reaches 60.8 frames/s, meeting real-time requirements. While improving the detection accuracy of defects at various scales, CDN-YOLOv7 significantly reduces the missed detection rate of defects.
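The normalized Wasserstein distance mentioned above can be sketched as follows, using the formulation commonly cited for tiny-object detection: each box is modeled as a 2D Gaussian, and the squared 2-Wasserstein distance between two such Gaussians reduces to a squared Euclidean distance between the vectors (cx, cy, w/2, h/2). The abstract does not specify how NF-EIoU combines this term with Focal-EIoU, so only the distance itself is shown. The normalizing constant `c` is a dataset-dependent hyperparameter, and the default used here is an arbitrary placeholder, not a value from the paper.

```python
import math


def normalized_wasserstein_distance(box_a, box_b, c=12.8):
    """Similarity in (0, 1] between boxes given as (cx, cy, w, h).

    c is an assumed, dataset-dependent normalizing constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


same = normalized_wasserstein_distance((10, 10, 4, 4), (10, 10, 4, 4))
near = normalized_wasserstein_distance((10, 10, 4, 4), (13, 10, 4, 4))
far = normalized_wasserstein_distance((10, 10, 4, 4), (30, 30, 4, 4))
print(same, near, far)
```

Unlike IoU, the value stays non-zero and changes smoothly for boxes that do not overlap at all, which is the usual motivation for using it with very small targets.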
    Reference | Related Articles | Metrics
    DY-YOLOv5: Target Detection for Aerial Image Based on Multiple Attention
    ZHAO Xin, CHEN Lili, YANG Weichuan, ZHANG Chengwang
    Computer Engineering and Applications    2024, 60 (7): 183-191.   DOI: 10.3778/j.issn.1002-8331.2309-0419
    Abstract205)      PDF(pc) (1074KB)(192)       Save
    Aiming at the problem of low detection accuracy caused by small targets, different scales and complex backgrounds in UAV aerial images, a target detection algorithm for UAV aerial images based on improved YOLOv5 is proposed. The algorithm introduces a target detection head method Dynamic Head with multiple attention mechanisms to replace the original detection head and improves the detection performance of the detection head in complex backgrounds. An upsampling and Concat operation is added to the neck part of the original model, and a multi-scale feature detection including minimal, small and medium targets is performed to improve the feature extraction ability of the model for medium and small targets. DenseNet is introduced and integrated with the C3 module of YOLOv5s backbone network to propose the C3_DenseNet module to enhance feature transfer and prevent model overfitting. The DY-YOLOv5 algorithm is applied to the VisDrone 2019 dataset, and the mean average precision (mAP) reaches 43.9%, which is 11.4 percentage points higher than the original algorithm. The recall rate (Recall) is 41.7%, which is 9.0 percentage points higher than the original algorithm. Experimental results show that the improved algorithm significantly improves the accuracy of target detection in UAV aerial images.
    Reference | Related Articles | Metrics
    Review of Text Classification Methods Based on Graph Neural Networks
    SU Yilei, LI Weijun, LIU Xueyang, DING Jianping, LIU Shixia, LI Haonan, LI Guanfeng
    Computer Engineering and Applications    2024, 60 (19): 1-17.   DOI: 10.3778/j.issn.1002-8331.2403-0142
    Abstract205)      PDF(pc) (3425KB)(240)       Save
    Text classification is an important task in the field of natural language processing, aiming to assign given text data to a predefined set of categories. Traditional text classification methods can only handle data in Euclidean space and cannot process non-Euclidean data such as graphs. Text data with a graph structure therefore cannot be processed directly, and its non-Euclidean structure cannot be captured. How to apply graph neural networks to text classification tasks has thus become one of the current research hotspots. This paper reviews the text classification methods based on graph neural networks. Firstly, it outlines the traditional text classification methods based on machine learning and deep learning, and summarizes the background and principles of graph convolutional neural networks. Secondly, it elaborates on the text classification methods based on graph neural networks according to different types of graph networks, and conducts an in-depth analysis of the application of graph neural network models in text classification. Then, it compares the current text classification models based on graph neural networks through comparative experiments and discusses the classification performance of the models. Finally, it proposes future research directions to further promote the development of this field.
    Reference | Related Articles | Metrics
    Survey of Neural Machine Translation
    ZHANG Junjin, TIAN Yonghong, SONG Zheyu, HAO Yufeng
    Computer Engineering and Applications    2024, 60 (4): 57-74.   DOI: 10.3778/j.issn.1002-8331.2305-0102
    Abstract203)      PDF(pc) (3432KB)(145)       Save
    Machine translation (MT) mainly studies how to translate a source language into a target language, which is of great significance for promoting communication between nationalities. At present, neural machine translation (NMT) has become the mainstream MT method owing to its translation speed and quality. To give a clearer picture of the field, this paper first introduces the history and methods of MT, and compares and summarizes three main methods: rule-based machine translation, statistics-based machine translation, and deep learning-based machine translation. Then NMT is introduced and its common types are explained. Next, six main research fields of NMT are introduced, including multimodal MT, non-autoregressive MT, document-level MT, multilingual MT, data augmentation technology, and preprocessing techniques. Finally, the future of NMT is discussed from four aspects: low-resource languages, context-sensitive translation, unknown words, and large models. This paper provides a systematic introduction to help readers better understand the development status of NMT.
    Reference | Related Articles | Metrics
    Review of Research on Multimodal Retrieval
    JIN Tao, JIN Ran, HOU Tengda, YUAN Jie, GU Xiaozhe
    Computer Engineering and Applications    2024, 60 (5): 62-75.   DOI: 10.3778/j.issn.1002-8331.2305-0294
    Abstract201)      PDF(pc) (657KB)(172)       Save
    With the growth of multimodal data, multimodal retrieval technology has received a lot of attention. With the introduction of computer and big data technology in the automobile, medical, and other industries, a large amount of industry data is itself presented in multimodal form. With the rapid development of these industries, people’s demand for information is constantly increasing, and single-modal data retrieval can no longer meet this demand. To solve these problems and meet the need to retrieve data in one modality using data from another, this paper studies multimodal retrieval methods through a literature review, analyzes different research methods such as common subspace, deep learning, and multimodal hashing algorithms, and sorts out the multimodal retrieval techniques proposed by researchers in recent years to solve these problems. Finally, the multimodal retrieval methods proposed in recent years are evaluated and compared according to the accuracy, efficiency, and characteristics of the retrieval. This paper analyzes the challenges encountered in multimodal retrieval and looks forward to its future application prospects.
    Reference | Related Articles | Metrics
    Dense Small Object Detection Algorithm Based on Improved YOLOv5 in UAV Aerial Images
    CHEN Jiahui, WANG Xiaohong
    Computer Engineering and Applications    2024, 60 (3): 100-108.   DOI: 10.3778/j.issn.1002-8331.2306-0289
    Abstract198)      PDF(pc) (739KB)(220)       Save
    UAV aerial images contain many small objects, drastic size changes, and dense occlusions. To address the difficulty that existing object detection algorithms have in detecting small objects in aerial images, an RDS-YOLOv5 detection algorithm for dense small objects is proposed. A new small object detection layer is added to the three detection layers of YOLOv5 to retain richer feature information, which enhances the ability of the network to extract small object features and reduces false and missed detections. A multi-scale feature extraction module, C3Res2Block, with a hierarchical residual structure is designed to improve the multi-scale feature representation capability of the network as well as to suppress the generation of conflicts. A decoupled detection head is used to avoid the prediction bias caused by the difference between tasks, which improves the localization and detection accuracy. The confidence of the anchor box is optimized using the Soft NMS algorithm to improve the detection accuracy of the model for dense small objects. The experimental results on the VisDrone dataset show that RDS-YOLOv5 improves mAP0.5 by 12.9 percentage points and mAP0.5:0.95 by 10.6 percentage points compared with the baseline model YOLOv5, and achieves better detection accuracy than the current mainstream object detection algorithms, so it can effectively accomplish the task of dense small object detection in UAV aerial images.
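The Soft NMS step mentioned above can be sketched as follows. This is the generic Gaussian variant of Soft-NMS, not the authors' implementation: instead of deleting boxes that overlap a higher-scoring box, it multiplies their confidence by exp(-IoU² / sigma), so heavily overlapped detections in dense scenes are down-weighted rather than lost. The values of `sigma` and `score_threshold`, and the example boxes, are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS; returns (box index, final score) in selection order."""
    remaining = [(score, index) for index, score in enumerate(scores)]
    kept = []
    while remaining:
        best = max(remaining)  # highest current score
        remaining.remove(best)
        best_score, best_index = best
        kept.append((best_index, best_score))
        decayed = []
        for score, index in remaining:
            overlap = iou(boxes[best_index], boxes[index])
            score *= math.exp(-(overlap ** 2) / sigma)  # decay, do not delete
            if score >= score_threshold:
                decayed.append((score, index))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
print(result)
```

In this example the second box overlaps the first heavily, so it is kept but with a reduced score and ends up ranked below the non-overlapping third box.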
    Reference | Related Articles | Metrics
    Survey on Distributed Assembly Permutation Flowshop Scheduling Problem
    ZHANG Jing, SONG Hongbo, LIN Jian
    Computer Engineering and Applications    2024, 60 (6): 1-9.   DOI: 10.3778/j.issn.1002-8331.2307-0276
    Abstract197)      PDF(pc) (619KB)(175)       Save
    With the rapid development of modern manufacturing, the past decades have witnessed a trend in which jobs are first processed in distributed production factories and then assembled into final products in an assembly factory. This manufacturing mode brings many advantages as well as some new challenges for resource scheduling. This paper surveys the literature on the distributed assembly permutation flowshop scheduling problem (DAPFSP). Firstly, the background and main issues in DAPFSP are introduced. Then, mathematical models, encoding and decoding schemes, and global and local search algorithms are thoroughly discussed for DAPFSP with the objective of minimizing the maximal completion time. Additionally, recent advances on DAPFSP with other objectives such as total flow time, DAPFSP with other constraints such as no-wait, and DAPFSP that takes issues such as setup time into consideration are also surveyed. Finally, several future research directions worthy of further investigation are pointed out.
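The objective mentioned above, the maximal completion time (makespan), can be made concrete with a small decoding sketch. It assumes a simplified DAPFSP instance: identical factories, each a permutation flowshop; every product is a set of jobs; and a single assembly machine that handles products in the order they become ready. That assembly rule and all of the data below are assumptions for illustration, not taken from the survey.

```python
def flowshop_completion_times(sequence, proc_times):
    """Completion time of each job on the last machine of one factory."""
    n_machines = len(next(iter(proc_times.values())))
    previous = [0] * n_machines  # completion times of the previous job
    finished = {}
    for job in sequence:
        current = [0] * n_machines
        for m in range(n_machines):
            ready = current[m - 1] if m > 0 else 0
            # A job starts once the machine is free and its prior stage is done.
            current[m] = max(previous[m], ready) + proc_times[job][m]
        previous = current
        finished[job] = current[-1]
    return finished


def dapfsp_makespan(factory_sequences, proc_times, products, assembly_times):
    """Makespan of a schedule given as one job sequence per factory."""
    finished = {}
    for sequence in factory_sequences:
        finished.update(flowshop_completion_times(sequence, proc_times))
    # A product can be assembled once all of its jobs are finished.
    ready = {p: max(finished[j] for j in jobs) for p, jobs in products.items()}
    clock = 0
    for product in sorted(products, key=lambda p: ready[p]):
        clock = max(clock, ready[product]) + assembly_times[product]
    return clock


# Hypothetical instance: 4 jobs, 2 machines per factory, 2 factories, 2 products.
proc_times = {"J1": [3, 2], "J2": [2, 4], "J3": [4, 1], "J4": [1, 3]}
factory_sequences = [["J1", "J2"], ["J3", "J4"]]
products = {"P1": ["J1", "J3"], "P2": ["J2", "J4"]}
assembly_times = {"P1": 2, "P2": 3}

makespan = dapfsp_makespan(factory_sequences, proc_times, products, assembly_times)
print(makespan)
```

A search algorithm for DAPFSP would call a decoder like this repeatedly to evaluate candidate assignments of jobs to factories and candidate job orders.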
    Reference | Related Articles | Metrics
    CIEFRNet: Abandoned Objects Detection Algorithm for Highway
    LI Xu, SONG Huansheng, SHI Qin, ZHANG Zhaoyang, LIU Zedong, SUN Shijie
    Computer Engineering and Applications    2024, 60 (5): 336-346.   DOI: 10.3778/j.issn.1002-8331.2306-0395
    Abstract196)      PDF(pc) (928KB)(113)       Save
    Abandoned objects on highways endanger traffic safety and easily cause traffic accidents, so it is critical to recognize and remove them in time. Because abandoned objects occupy a small area of the image and the image background is complex, existing detection methods often suffer from missed and false detections. To address these problems, an abandoned objects detection algorithm based on contextual information enhancement and feature refinement, called CIEFRNet, is proposed. Firstly, a backbone feature extraction module (CSP-COT) incorporating a contextual Transformer is designed to fully mine local static and global dynamic context information and enhance the feature representation of small abandoned objects. In addition, the proposed improved spatial pyramid pooling (ISPP) is used in the backbone, where multi-scale downsampling of features is realized by cascaded dilated convolution, which reduces the loss of object detail information. To improve the feature fusion ability, a feature refinement module (CNAB) is designed, in which a proposed mixed attention mechanism (ECSA) is embedded to suppress image background noise and enhance the features of tiny abandoned objects. Finally, the WIoU loss function, based on a dynamic non-monotonic focusing mechanism, is used to improve the learning ability for small abandoned objects and accelerate network convergence. The experimental results demonstrate that the proposed method achieves 96.5%, 81.6%, 88.1% and 46.5% of accuracy, recall, AP0.5 and AP0.5:0.95 on the self-made highway abandoned objects dataset, respectively, which is better than the currently prevailing object detection methods, and its algorithm complexity is also lower, meeting the needs of practical applications.
    Reference | Related Articles | Metrics
    Research Progress of Surface Electromyography Hand Motion Recognition
    LI Zhenjiang, WEI Dejian, FENG Yanyan, YU Fengfan, MA Yifan
    Computer Engineering and Applications    2024, 60 (3): 29-43.   DOI: 10.3778/j.issn.1002-8331.2305-0269
    Abstract193)      PDF(pc) (784KB)(169)       Save
    Surface electromyography (sEMG) is a non-invasive method of measuring muscle activity, which contains rich information related to human motion and can be used for hand motion recognition. Hand motion recognition based on sEMG refers to the classification and recognition of hand motions by analyzing the sEMG signals of the hand muscles. Driven by the development of neural networks, sEMG has made great progress in the field of hand motion recognition. However, sEMG signals suffer from high noise and poor stability and cannot be efficiently utilised, which makes it difficult to obtain high-precision hand motion recognition models and has hindered the translation and application of research results. This paper summarizes the research progress of sEMG hand motion recognition methods in detail. Firstly, public EMG datasets commonly used in the field of action recognition are introduced, and the acquisition process for self-collected EMG datasets is described. Then the existing sEMG hand motion recognition models are classified into three categories according to the research method: hand motion recognition based on machine learning, hand motion recognition based on deep learning, and hand motion recognition based on hybrid network structures. The related models in each category are summarised and analyzed, and suggestions are made for addressing their shortcomings. Finally, the problems to be solved and the future development directions of hand motion recognition research are discussed.
    Reference | Related Articles | Metrics
    Process of Weakly Supervised Salient Object Detection
    YU Junwei, GUO Yuansen, ZHANG Zihao, MU Yashuang
    Computer Engineering and Applications    2024, 60 (10): 1-15.   DOI: 10.3778/j.issn.1002-8331.2308-0206
    Abstract193)      PDF(pc) (6029KB)(263)       Save
    Salient object detection aims to accurately detect and locate the most attention-grabbing objects or regions in images or videos, facilitating better object recognition and scene analysis. Despite the effectiveness of fully supervised saliency detection methods, acquiring large pixel-level annotated datasets is challenging and costly. Weakly supervised detection methods utilize relatively easy-to-obtain image-level labels or noisy weak labels to train models, demonstrating good performance in practical applications. This paper comprehensively compares the mainstream methods and application scenarios of fully supervised and weakly supervised saliency detection methods, and then analyzes the data annotation methods using weak labels and their impact on salient object detection. The latest research progress in salient object detection under weakly supervised conditions is reviewed, and the performance of various weakly supervised methods is compared on several public datasets. Finally, the potential applications of weakly supervised saliency detection methods in special fields such as agriculture, medicine and military are discussed, highlighting the existing challenges and future development trends in this research area.
    Reference | Related Articles | Metrics
    Improved YOLOv8 Method for Anomaly Behavior Detection with Multi-Scale Fusion and FMB
    SHI Yangyu, ZUO Jing, XIE Chengjie, ZHENG Diwen, LU Shuhua
    Computer Engineering and Applications    2024, 60 (9): 101-110.   DOI: 10.3778/j.issn.1002-8331.2401-0240
    Abstract191)      PDF(pc) (7946KB)(235)       Save
    To resolve the problems of anomaly behavior detection, including multi-scale variations, missed and false detections, and complex background interference, a method incorporating multi-scale feature fusion and a fast multi-cross block (FMB) is proposed for anomaly behavior detection. With YOLOv8 as the baseline network, an FMB has been designed in the backbone to enhance context information awareness and reduce network parameters. Meanwhile, a spatial-progressive convolution pooling (S-PCP) module has been proposed to achieve multi-scale information fusion, thereby reducing the missed and false detections caused by scale differences and improving detection accuracy. A SimAM attention mechanism has been introduced in the neck to suppress complex background interference and improve object detection performance. WIoU has been used to balance the penalty on anchor boxes, enhancing the model’s generalization performance. The proposed method has been extensively validated on the UCSD-Ped1 and UCSD-Ped2 datasets, and its generalization has been tested on the OPIXray dataset. The results indicate that the proposed method, with fewer parameters, achieves varying degrees of improvement in anomaly behavior recognition accuracy compared to many advanced detection algorithms, demonstrating that it is an effective detection method for pedestrian anomaly behavior.
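The SimAM attention mentioned above is parameter-free, which makes it easy to sketch without a deep learning framework. The version below follows the commonly used formulation as an assumption about what the authors apply: for each channel, every position gets a weight sigmoid(d / (4 · (variance + λ)) + 0.5), where d is the squared deviation of that position from the channel mean. Positions that stand out from the rest of the channel are therefore weighted more heavily. The single-channel list-of-lists layout and the value of `e_lambda` are illustrative choices.

```python
import math


def simam(channel, e_lambda=1e-4):
    """Apply SimAM-style reweighting to one channel (a 2D list of floats)."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    squared_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    # Variance estimate over the channel (needs at least two positions).
    variance = sum(sum(row) for row in squared_dev) / (len(values) - 1)

    def weight(d):
        energy = d / (4 * (variance + e_lambda)) + 0.5
        return 1.0 / (1.0 + math.exp(-energy))  # sigmoid

    return [
        [v * weight(d) for v, d in zip(row, dev_row)]
        for row, dev_row in zip(channel, squared_dev)
    ]


# A flat background with one distinctive activation in the center.
feature_map = [[1.0, 1.0, 1.0], [1.0, 5.0, 1.0], [1.0, 1.0, 1.0]]
attended = simam(feature_map)
center_weight = attended[1][1] / feature_map[1][1]
background_weight = attended[0][0] / feature_map[0][0]
print(center_weight, background_weight)
```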
    Reference | Related Articles | Metrics