Most Downloaded Articles

    Published in last 1 year
    Research on Intelligent Question Answering System Based on Large Language Model
    REN Haiyu, LIU Jianping, WANG Jian, GU Xunxun, CHEN Xi, ZHANG Yue, ZHAO Changxu
    Computer Engineering and Applications    2025, 61 (7): 1-24.   DOI: 10.3778/j.issn.1002-8331.2409-0300
    Abstract (347)      PDF(pc) (1720KB) (393)
    Intelligent question answering is a core subfield in natural language processing, aiming at systems that understand and answer natural language questions posed by users. Traditional question answering systems usually rely on predefined rules and limited corpora and are unable to handle complex multi-round dialogues. Large language models are natural language processing models based on deep learning technology, with billions or even hundreds of billions of parameters. They can not only understand and generate natural language but also significantly improve the accuracy and efficiency of question answering systems, promoting the development of intelligent question answering technology. In recent years, intelligent question answering based on large model technology has gradually become a research hotspot, but a systematic review in this field is still relatively lacking. Therefore, this article conducts a systematic review of intelligent question answering systems based on large models. Firstly, it introduces the basic concepts of question answering systems, datasets, and their evaluation metrics. Secondly, it presents question answering systems based on large models, including those based on prompt learning, knowledge graphs, retrieval-augmented generation, and intelligent agents, as well as the technical route of fine-tuning in question answering tasks, and compares the advantages, disadvantages, and application scenarios of the five methods in question answering systems. Finally, it summarizes the current research challenges and future development trends of question answering systems based on large language models.
    Research Progress on Designing Lightweight Deep Convolutional Neural Networks
    ZHOU Zhifei, LI Hua, FENG Yixiong, LU Jianguang, QIAN Songrong, LI Shaobo
    Computer Engineering and Applications    2024, 60 (22): 1-17.   DOI: 10.3778/j.issn.1002-8331.2404-0372
    Abstract (320)      PDF(pc) (6330KB) (382)
    Lightweight design is a popular paradigm for reducing the dependence of deep convolutional neural networks (DCNNs) on device performance and hardware resources. Its purpose is to increase computational speed and reduce memory footprint without sacrificing network performance. This paper presents an overview of lightweight design approaches for DCNNs, focusing on research progress in recent years. It covers two major lightweighting strategies, namely system design and model compression, compares in depth the innovations, strengths, and limitations of these two types of approaches, and explores the underlying frameworks that support lightweight models. In addition, scenarios in which lightweight networks have been successfully applied are described, and future development trends of DCNN lightweighting are predicted, aiming to provide useful insights and references for research on lightweight deep convolutional neural networks.
    Research Progress on Recommendation Algorithms with Knowledge Graph Visualization Analysis
    LIN Suqing, LUO Dingnan, ZHANG Shuhua
    Computer Engineering and Applications    2024, 60 (21): 1-17.   DOI: 10.3778/j.issn.1002-8331.2312-0032
    Abstract (303)      PDF(pc) (1215KB) (341)
    The application and proliferation of internet technology have caused exponential growth in data, increasing the complexity of information retrieval from massive datasets. Recommendation algorithms have attracted significant attention for alleviating information overload, with relevant research findings continually emerging. A total of 4,773 Chinese and 4,531 English publications from 2012 to 2024 were sourced from China National Knowledge Infrastructure (CNKI) and the Web of Science (WOS) core collection. The visualization tools CiteSpace and VOSviewer were used to generate basic information and keyword co-occurrence graphs for the literature. Core technology keywords, including knowledge graph, graph neural network, and deep learning, were extracted through graph analysis, and the corresponding representative recommendation algorithms were selected. The core mechanisms and underlying principles of the algorithms are presented visually through charts, focusing on the limitations and challenges of existing research as well as targeted solutions. Knowledge architecture diagrams are developed for the algorithms associated with each core technology keyword, following a challenge-solution-source literature framework. The recommendation principles are thereby visualized.
    Review of Deep Learning Models for Image Classification Based on Convolutional Neural Networks
    LIU Hongda, SUN Xuhui, LI Yibin, HAN Lin, ZHANG Yu
    Computer Engineering and Applications    2025, 61 (11): 1-21.   DOI: 10.3778/j.issn.1002-8331.2411-0196
    Abstract (284)      PDF(pc) (1675KB) (329)
    Using neural network models for classification has always been an important research direction. With the development of deep learning, the requirements placed on neural network models keep rising. At the same time, demanding requirements are placed on the model's recognition rate, number of parameters, and training time. Convolutional neural networks have long been the mainstream method for image classification in deep learning. Firstly, this paper introduces the development history of convolutional neural network classification models and analyzes the design ideas of each model at different stages. Secondly, it reviews relevant examples of Transformers combined with convolutional neural networks, as well as the application of each model in other fields. Finally, possible development directions of convolutional neural networks are discussed.
    Research Advance of Crack Detection for Infrastructure Surfaces Based on Deep Learning
    HU Xiangkun, LI Hua, FENG Yixiong, QIAN Songrong, LI Jian, LI Shaobo
    Computer Engineering and Applications    2025, 61 (1): 1-23.   DOI: 10.3778/j.issn.1002-8331.2407-0407
    Abstract (310)      PDF(pc) (9136KB) (324)
    Civil infrastructure is prone to changes in physical condition or performance after long-term use, which can damage its function and service safety, so it is essential to monitor the structural health of such facilities. Crack detection is an extremely important part of structural health monitoring, and timely detection and identification of such damage can effectively avoid severe accidents. Crack detection methods based on computer vision are simple, fast, and accurate, and are widely used for surface crack detection in civil infrastructure. This paper reviews deep learning-based crack detection methods for infrastructure surfaces from three detection directions: image classification, object detection, and semantic segmentation. Common data collection methods and commonly used public crack datasets are also summarized. Finally, the difficulties and challenges of deep learning-based surface crack detection methods for infrastructure are discussed, and possible future development directions are envisioned.
    Review of Multi-Modal Driver Emotion Recognition
    ZHOU Xinying, LI Leixiao, LIN Hao, ZHANG Hucheng
    Computer Engineering and Applications    2025, 61 (10): 1-18.   DOI: 10.3778/j.issn.1002-8331.2410-0153
    Abstract (232)      PDF(pc) (1630KB) (316)
    Accurately identifying driver emotions can effectively prevent potentially dangerous driving behaviors and reduce the occurrence of traffic accidents, making it an important technology for improving road safety and the driving experience. With the progress of artificial intelligence and multi-modal data processing technology, emotion recognition has gradually developed from a single-modal approach to a multi-modal approach. This paper reviews the current research progress of multi-modal driver emotion recognition and focuses on the recognition process for facial expressions, voice signals, physiological signals, and vehicle behavior. The key steps include data preprocessing, feature extraction, and multi-modal fusion. By analyzing existing research, the advantages and disadvantages of different methods are summarized, and several driver emotion-related datasets are introduced. Finally, in light of current research challenges, five future research directions in multi-modal driver emotion recognition are proposed.
    Review of Research on Artificial Intelligence in Traditional Chinese Medicine Diagnosis and Treatment
    SU Youli, HU Xuanyu, MA Shijie, ZHANG Yuning, Abudukelimu Abulizi, Halidanmu Abudukelimu
    Computer Engineering and Applications    2024, 60 (16): 1-18.   DOI: 10.3778/j.issn.1002-8331.2312-0400
    Abstract (330)      PDF(pc) (6171KB) (305)
    The field of traditional Chinese medicine (TCM) diagnosis and treatment is gradually moving towards standardization, objectification, modernization, and intelligence. In this process, the integration of artificial intelligence (AI) has greatly propelled the advancement of TCM diagnosis and treatment, scientific research, and TCM inheritance. The review starts from the current research status of AI in TCM and traces the application and development of AI in TCM through three stages: expert systems and rule engines, traditional machine learning algorithms, and deep learning. It then summarizes the knowledge management tools and large language models of TCM developed in recent years. Finally, this paper analyzes the challenges that AI in TCM currently faces: data fairness, multimodal data understanding, model robustness, personalized medicine, and interpretability. To address these challenges, it is necessary to continuously explore and propose possible solutions to promote the in-depth development of intelligent TCM diagnosis and treatment, thus better meeting the health needs of people.
    Improved YOLOv11n Small Object Detection Algorithm in UAV View
    LI Bin, LI Shenglin
    Computer Engineering and Applications    2025, 61 (7): 96-104.   DOI: 10.3778/j.issn.1002-8331.2411-0072
    Abstract (296)      PDF(pc) (1241KB) (290)
    To deal effectively with the challenges of complex backgrounds, dense targets, target miniaturization, and mobile terminal deployment faced by small target detection in UAV aerial photography, the YOLOv11n model is improved. Firstly, the RFCBAMConv module is used to improve C3k2, which enhances feature extraction. Secondly, a dilated feature pyramid convolution (DFPC) module is designed to replace the original SPPF layer; through multi-scale dilated convolution, it strengthens the extraction of detail features of small UAV targets. Thirdly, a new feature pyramid structure is proposed, in which a 160×160 feature map output is added at the P2 layer to extract the feature information of small targets. This replaces the traditional practice of adding a P2 small target detection head. The CSPOK module and ContextGuidedBlock_Down (CGBD) convolution are introduced, which significantly improve the extraction of global features and the fusion of multi-scale features. Finally, the dynamic detection head (DyHead) replaces the original detection head, which improves the detection accuracy of the model. The experimental results show that the mAP@0.5 and mAP@0.5:0.95 of the improved model on the VisDrone dataset increase by 0.071 and 0.049, respectively. In addition, generalization experiments on the AI-TOD and SODA-A datasets show that the improved model achieves improvements of 0.055 and 0.048 in mAP@0.5, respectively, which verifies the effectiveness and generality of the model.
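As a rough illustration of why multi-scale dilated convolution widens the receptive field, and why a P2-level output is 160×160, the sketch below uses the standard effective-kernel-size formula for dilated convolution. The dilation rates, the 3×3 kernel, the 640×640 input size, and the stride of 4 for P2 are assumptions chosen for illustration; the abstract does not state them, and this is not the authors' DFPC implementation.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def feature_map_size(input_size: int, stride: int) -> int:
    """Spatial size of a feature map at a given cumulative stride."""
    return input_size // stride


# Hypothetical dilation rates for three parallel 3x3 branches.
DILATIONS = (1, 2, 3)
spans = {d: effective_kernel_size(3, d) for d in DILATIONS}
print(spans)  # {1: 3, 2: 5, 3: 7}

# Assumed 640x640 input; P2 conventionally sits at stride 4.
print(feature_map_size(640, 4))  # 160
```

Each branch keeps the same nine weights per channel pair but covers a wider span, which is the usual motivation for combining several dilation rates.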
    Comprehensive Review of Large Language Model Fine-Tuning
    ZHANG Qintong, WANG Yuchao, WANG Hexi, WANG Junxin, CHEN Hai
    Computer Engineering and Applications    2024, 60 (17): 17-33.   DOI: 10.3778/j.issn.1002-8331.2312-0035
    Abstract (261)      PDF(pc) (6335KB) (282)
    The rise of large-scale language models signifies a new milestone in the field of deep learning, with fine-tuning techniques playing a crucial role in optimizing model performance. This paper provides a comprehensive overview of fine-tuning techniques for large-scale language models. It reviews the development stages of language models, including statistical language models, neural network language models, pre-trained language models, and large language models. The basic concepts of fine-tuning are explored, covering classic fine-tuning, parameter-efficient fine-tuning, prompt tuning, and reinforcement learning fine-tuning. The paper delves into the principles and development of each fine-tuning technique, offering a comparative analysis across these four major categories. In conclusion, the paper summarizes the current state of research on fine-tuning techniques and underscores the potential research value in this domain, providing insights into future directions of development.
    Research Progress on Multi-Agent Deep Reinforcement Learning and Scalability
    LIU Yanfei, LI Chao, WANG Zhong, WANG Jieling
    Computer Engineering and Applications    2025, 61 (4): 1-24.   DOI: 10.3778/j.issn.1002-8331.2407-0034
    Abstract (241)      PDF(pc) (2161KB) (281)
    Multi-agent deep reinforcement learning has shown great potential in solving agent collaboration, competition, and communication problems in recent years. However, as its application expands across more domains, scalability has become a focal concern and an important problem in moving from theoretical research to large-scale engineering applications. This paper reviews reinforcement learning theory and typical deep reinforcement learning algorithms, introduces three learning paradigms of multi-agent deep reinforcement learning and their representative algorithms, and briefly summarizes the current mainstream open-source experimental platforms. It then delves into the research progress on scalability with respect to the number of agents and to scenarios in multi-agent deep reinforcement learning, analyzes the main problems faced by each method, and presents existing solutions. Finally, the application prospects and development trends of multi-agent deep reinforcement learning are discussed, providing references and inspiration to further advance research in this field.
    Review of YOLO Methods for Universal Object Detection
    MI Zeng, LIAN Zhe
    Computer Engineering and Applications    2024, 60 (21): 38-54.   DOI: 10.3778/j.issn.1002-8331.2404-0130
    Abstract (292)      PDF(pc) (5798KB) (278)
    As the first single-stage object detection algorithm in the era of deep learning, YOLO has sparked a wave of enthusiasm in the field of computer vision with its powerful and unique paradigm, and has become a milestone achievement in object detection algorithms. It remains a typical algorithm that achieves the best balance between speed and accuracy, and is widely used in industrial fields such as autonomous driving and intelligent vision systems. In the past eight years, driven by deep learning technology, YOLO methods have developed rapidly and have had a profound impact on the entire field of object detection. This paper conducts an in-depth investigation of work related to the YOLO method from the perspective of technological evolution, comprehensively summarizing the innovations and contributions of each iteration from the initial YOLO v1 to the latest YOLO v9 and YOLO v10. Based on the significant technological improvements at different points in time, the YOLO method is divided into four parts: early basic YOLO, standard version YOLO, standard improvement YOLO, and unique improvement YOLO. The unique perspectives of the improvement methods in each period are introduced in detail. In addition, the datasets and indicators for evaluating the YOLO method are summarized, and detailed experimental results of different versions of YOLO and different models of the same version of YOLO are collected. The development and changes of YOLO are summarized at both macro and micro levels. Through analysis, the differences and inherent connections in the development framework, backbone network architecture, and prior box usage among different versions of YOLO are revealed, emphasizing the importance of balancing speed and accuracy in YOLO. Finally, through systematic review, the future development trends of the YOLO method are summarized.
    Review of Text Classification Methods Based on Graph Neural Networks
    SU Yilei, LI Weijun, LIU Xueyang, DING Jianping, LIU Shixia, LI Haonan, LI Guanfeng
    Computer Engineering and Applications    2024, 60 (19): 1-17.   DOI: 10.3778/j.issn.1002-8331.2403-0142
    Abstract (247)      PDF(pc) (3425KB) (272)
    Text classification is an important task in the field of natural language processing, aiming to assign given text data to a predefined set of categories. Traditional text classification methods can only handle data in Euclidean space and cannot process non-Euclidean data such as graphs. They therefore cannot directly process text data with graph structure or capture the non-Euclidean structure in the graph. How to apply graph neural networks to text classification tasks is thus one of the current research hotspots. This paper reviews the text classification methods based on graph neural networks. Firstly, it outlines the traditional text classification methods based on machine learning and deep learning, and summarizes the background and principles of graph convolutional neural networks. Secondly, it elaborates on the text classification methods based on graph neural networks according to different types of graph networks, and conducts an in-depth analysis of the application of graph neural network models in text classification. Then, it compares the current text classification models based on graph neural networks through comparative experiments and discusses the classification performance of the models. Finally, it proposes future research directions to further promote the development of this field.
    Status and Challenges of Large Language Models Applications in Vertical Domains
    JI Xinmeng, ZAN Hongying, CUI Tingting, ZHANG Kunli
    Computer Engineering and Applications    2025, 61 (12): 1-11.   DOI: 10.3778/j.issn.1002-8331.2409-0181
    Abstract (243)      PDF(pc) (839KB) (267)
    In recent years, large language models, exemplified by ChatGPT, have garnered significant attention across various fields and demonstrated outstanding performance, fueling a new wave of advancements in artificial intelligence technology. At present, there are over a hundred domestic large language models, spanning multiple industry sectors, with their applications continuously expanding. To better address the development of large language models in natural language processing and their impact on both general tasks and specialized domain applications, this paper reviews the evolution of natural language processing and large language models. It provides an overview of current large model technologies and their applications in vertical domains such as healthcare, law, and finance. Furthermore, it analyzes the challenges faced by large models during deployment, such as limitations in capabilities and collaboration difficulties. Lastly, the paper discusses the future research directions aimed at addressing these issues and enhancing the practical application of large language models.
    Review of Application of Visual Foundation Model SAM in Medical Image Segmentation
    SUN Xing, CAI Xiaohong, LI Ming, ZHANG Shuai, MA Jingang
    Computer Engineering and Applications    2024, 60 (17): 1-16.   DOI: 10.3778/j.issn.1002-8331.2401-0136
    Abstract (290)      PDF(pc) (7912KB) (262)
    With the continuous development of foundation model technology, visual foundation models represented by the segment anything model (SAM) have made significant breakthroughs in the field of image segmentation. Driven by prompts, SAM accomplishes a series of downstream segmentation tasks, aiming to address all image segmentation problems comprehensively. The application of SAM in medical image segmentation is therefore of great significance, as its generalization performance lets it adapt to various medical images, providing healthcare professionals with a more comprehensive understanding of anatomical structures and pathological information. This paper introduces commonly used datasets for image segmentation and explains SAM’s network architecture and generalization capabilities in detail. It focuses on a thorough analysis of SAM’s application in five major categories of medical images: whole-slide imaging, magnetic resonance imaging, computed tomography, ultrasound, and multimodal images. The review summarizes the strengths and weaknesses of SAM, along with corresponding improvement methods. In light of current challenges in the field of medical image segmentation, the paper discusses and anticipates future directions for SAM’s development.
    Research on Unmanned Aerial Vehicle Swarm Resilience Assessment and Reconfiguration Technology
    WEI Chenyue, HE Ming, HAN Wei, XU Xin, GAO Hong
    Computer Engineering and Applications    2024, 60 (15): 1-10.   DOI: 10.3778/j.issn.1002-8331.2401-0452
    Abstract (215)      PDF(pc) (4418KB) (249)
    Unmanned aerial vehicle (UAV) swarms are often affected in practical applications by disturbances such as terrain, wind, snow, rain, fog, and anti-aircraft strikes, which degrade swarm performance and mission accomplishment capability. To effectively assess and improve swarm anti-disturbance capability, an in-depth study is carried out on UAV swarm resilience assessment indexes and resilience reconfiguration methods. Firstly, the current research status of UAV swarm resilience assessment indexes is sorted out and analyzed. Secondly, research on UAV swarm resilience reconfiguration methods is summarized in terms of predictive reconfiguration and anti-disturbance reconfiguration. To address the problems of incomplete assessment indexes and the inability of swarms to reconfigure adaptively under multi-task and multi-disturbance situations, multi-dimensional resilience assessment indexes and a UAV swarm phase change reconfiguration method are proposed, respectively. These further take into account the impact of coverage, energy consumption, and other factors on swarm performance, realize adaptive phase change across different task types and disturbance types, and significantly improve the swarm’s ability to cope with disturbances. Finally, the paper concludes with an outlook on future development trends of UAV swarm resilience reconfiguration.
    Overview of Causal Learning Techniques and Applications
    LONG Xiangfu, LI Shaobo, ZHANG Yizong, YANG Lei, LI Chuanjiang
    Computer Engineering and Applications    2024, 60 (24): 1-19.   DOI: 10.3778/j.issn.1002-8331.2405-0407
    Abstract (212)      PDF(pc) (6887KB) (242)
    Machine learning is the core of artificial intelligence and data science, and is widely used in education, transportation, and manufacturing. As machine learning develops and its application fields expand, models have revealed unresolved problems in interpretability and fairness. Causal learning (CL), a method combining causality with machine learning techniques, can enhance model interpretability and address fairness problems, and has gradually become a research hotspot in academia. Firstly, after introducing the relevant theoretical background of CL, this paper analyzes and outlines the techniques of causal explanation, causal supervised learning, causal fairness, and causal reinforcement learning, organized by the problems that CL can solve. Secondly, the applications of CL in the fields of medicine, agriculture, and intelligent manufacturing are summarized from multiple perspectives. Finally, some open problems and challenges of CL are summarized, and future research directions are given, aiming to promote the continuous development of CL.
    Improved RT-DETR Algorithm for Aerial Small Object Detection
    LIU Siyuan, GAO Kai, YONG Longquan
    Computer Engineering and Applications    2025, 61 (4): 272-281.   DOI: 10.3778/j.issn.1002-8331.2407-0399
    Abstract (184)      PDF(pc) (1975KB) (238)
    To address missed and false detections of small objects in aerial photography images by existing object detection algorithms, an improved algorithm based on RT-DETR (real-time detection transformer) is proposed. Partial convolution (PConv) is introduced into the backbone network, and a PConvBlock structure is designed. Then, a BasicBlock-PConvBlock module composed of PConvBlocks replaces the original BasicBlock, effectively reducing the number of model parameters. The bidirectional feature pyramid network (BiFPN) structure is adopted to optimize the feature fusion module. The S2 feature is introduced to enhance the detection of small objects. The CARAFE upsampling operator is introduced to strengthen the fast fusion of multi-scale features. Experimental results show that the improved model has 13.9% fewer parameters than the RT-DETR model, and that mAP@0.5 and mAP@0.5:0.95 improve by 2.4 and 1.9 percentage points, respectively, on the VisDrone test set. On the TT100K and DOTA datasets, the improved model also outperforms the RT-DETR algorithm. The improved model significantly enhances detection accuracy while maintaining a smaller parameter count and computational cost, meeting the real-time detection requirements for drone aerial photography images.
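To see why a partial convolution can cut parameters, the sketch below compares weight counts for a regular convolution and a PConv layer. It assumes PConv here means the variant in which a regular convolution is applied to only a fraction of the channels and the remaining channels pass through unchanged; the 64-channel layer, 3×3 kernel, and 1/4 ratio are illustrative assumptions, not values from the abstract, and the 13.9% whole-model figure cannot be derived from this single-layer count.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x k convolution, bias omitted."""
    return k * k * c_in * c_out


def pconv_params(channels: int, partial_ratio: float = 0.25, k: int = 3) -> int:
    """Weight count when only a fraction of channels is convolved.

    The remaining channels are assumed to pass through untouched,
    so they contribute no weights.
    """
    c_p = int(channels * partial_ratio)
    return conv_params(c_p, c_p, k)


# Hypothetical layer width, for illustration only.
channels = 64
full = conv_params(channels, channels)
partial = pconv_params(channels)
print(full, partial, partial / full)  # 36864 2304 0.0625
```

With a 1/4 channel ratio, the convolved part holds 1/16 of the weights of the full layer, since the count scales with the square of the channel number.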
    Research and Progress on Super-Resolution Reconstruction Methods for Terahertz Images
    JIANG Yuying, JIANG Mengdie, GE Hongyi, ZHANG Yuan, LI Guangming, CHEN Xinyu, WEN Xixi, CHEN Hao
    Computer Engineering and Applications    2024, 60 (18): 1-16.   DOI: 10.3778/j.issn.1002-8331.2401-0161
    Abstract (211)      PDF(pc) (6043KB) (238)
    Image super-resolution has been an important research topic in image processing in recent decades, aiming to reconstruct a high-resolution image from a low-resolution one. It overcomes the manufacturing and cost limitations of sensors and optical devices by improving image resolution algorithmically, which makes it a simple, efficient, and low-cost approach. As an emerging technology, terahertz (THz) technology has been widely used in many fields. Because of THz diffraction and scattering, THz images suffer from blur and unclear texture details, and more and more scholars are committed to developing super-resolution reconstruction methods for THz images. Based on a survey of recent literature on THz technology and super-resolution reconstruction, this paper elaborates on the three major reconstruction methods for THz images, focuses on deep learning-based methods, and compares the reconstruction effects, advantages, and disadvantages of various algorithms. The THz image quality assessment indexes and commonly used datasets are reviewed, and applications of THz image super-resolution reconstruction technology are summarized. Finally, the future development trend of THz image super-resolution reconstruction technology is discussed.
    Improved Road Defect Detection Algorithm Based on YOLOv8
    WANG Xueqiu, GAO Huanbing, JIA Zemeng
    Computer Engineering and Applications    2024, 60 (17): 179-190.   DOI: 10.3778/j.issn.1002-8331.2404-0288
    Abstract (261)      PDF(pc) (5995KB) (230)
    Various defects can emerge on the road surface after prolonged use. Failing to promptly detect and repair these defects can significantly reduce the road’s lifespan and jeopardize driving safety. Consequently, real-time detection of road defects assumes paramount importance. However, traditional detection methods suffer from sluggish speed and hefty cost requirements. Hence, to tackle these challenges, a novel road detection algorithm called DML-YOLO is proposed, which builds upon the YOLOv8 framework. This algorithm integrates the MultiPath coordinate attention (MPCA) mechanism into the backbone network to enhance feature extraction. Additionally, the C2f-MPDC module is introduced to dynamically adjust the receptive field and improve detection capabilities. Furthermore, the network’s neck structure is redesigned, introducing a novel diversity feature pyramid network (DFPN) that reduces model size and fuses low-level feature maps to extract rich, detailed information and elevate the success rate of detecting small targets. Moreover, a lightweight shared convolutional detection head (LSCD head) is meticulously designed to enhance detection efficiency while reducing model size. Ultimately, extensive experimental results demonstrate that DML-YOLO achieves remarkable average detection precision, with mAP@0.5 scores of 89.6% on the RDD2022 dataset and 73.6% on the VOC2007 dataset, surpassing other models tested. Additionally, compared to the YOLOv8 model, DML-YOLO boasts a reduction of 32.37% in parameter count and 14.49% in computational workload, making it highly suitable for deployment in resource-constrained computing environments like embedded systems and mobile devices.
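The mAP@0.5 scores quoted above rest on an intersection-over-union (IoU) test: a predicted box counts as a match for a ground-truth box only if their IoU is at least 0.5. The sketch below shows that test under the assumption of corner-format boxes `(x1, y1, x2, y2)`; the function names are hypothetical, and the ten-threshold list follows the common COCO-style convention for mAP@0.5:0.95 rather than anything stated in the abstract.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """True if the prediction overlaps the ground truth enough to count."""
    return iou(pred, gt) >= threshold


# IoU thresholds 0.50, 0.55, ..., 0.95 averaged over for mAP@0.5:0.95.
COCO_STYLE_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(is_match((0, 0, 2, 2), (1, 1, 3, 3)))  # False
```

Average precision is then computed per class from the matched and unmatched predictions, and mAP is the mean over classes; that part is omitted here.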
    Review of Development of Visual-Inertial Joint Calibration
    ZHAO Junyang, LYU Shenhua, LI Yongxu, ZHU Huixin, ZHANG Kefan
    Computer Engineering and Applications    2025, 61 (8): 1-16.   DOI: 10.3778/j.issn.1002-8331.2409-0330
    Abstract (191)      PDF(pc) (1197KB) (229)
    The joint use of cameras and an IMU (inertial measurement unit) can fully leverage the complementary advantages of the two sensors, enabling data fusion and mutual calibration. In recent years, a variety of intelligent joint calibration methods have emerged, but a unified summary and analysis is lacking. This paper therefore classifies and organizes visual-inertial joint calibration methods in a unified way, analyzes the application characteristics and limitations of the various approaches, and provides a better basis for choosing among camera-IMU joint calibration methods in application or research. Firstly, it introduces the calibration parameters and principles for both the camera and the IMU, discussing them from temporal and spatial perspectives. Secondly, it classifies and comparatively analyzes online and offline temporal calibration methods. From a spatial perspective, the paper categorizes calibration methods, based on the distinct principles of IMU and camera calibration, into four types: optimization-based calibration, decoupled model-based calibration, filtering-based calibration, and machine learning-based calibration, and evaluates the advantages and characteristics of each approach. Finally, it proposes future development trends of joint calibration, including spatiotemporal unified calibration, a greater variety of calibration toolkits, the expansion of machine learning applications, and multi-sensor joint calibration.
    Review of Application of BEV Perceptual Learning in Autonomous Driving
    HUANG Deqi, HUANG Haifeng, HUANG Deyi, LIU Zhenhang
    Computer Engineering and Applications    2025, 61 (6): 1-21.   DOI: 10.3778/j.issn.1002-8331.2407-0501
    Abstract (212)      PDF(pc) (2079KB) (227)
    As the types of sensors used as acquisition inputs in the autonomous driving perception module continue to develop, it becomes increasingly difficult to represent multi-modal data uniformly. BEV perception learning in the autonomous driving perception task module can integrate multi-modal data into a unified feature space, and therefore has better development potential than other perception learning models. The reasons for this potential are summarized from five aspects: research significance, spatial deployment, preparation work, algorithm development, and evaluation metrics. From a framework perspective, BEV perception models can be grouped into four series: the Lift-Splat-Lss series, IPM (inverse perspective mapping) conversion, MLP view conversion, and Transformer view conversion. The input data can be grouped into two categories: the first, pure image feature input, includes monocular camera input and multi-camera input; the second, fusion data input, covers not only simple fusion of point cloud data and image features, but also knowledge distillation fusion guided or supervised by point cloud data and the fusion of height segmentation guided by slices. The paper gives an overview of four autonomous driving tasks to which BEV perception models are applied, namely multi-target tracking, map segmentation, lane detection, and 3D target detection, and summarizes the shortcomings of the four series of current BEV perception learning frameworks.
    Review on Deep Learning-Based 2D Single-Person Pose Estimation
    SU Yanyan, QIU Zhiliang, LI Guo, LU Shenglian, CHEN Ming
    Computer Engineering and Applications    2024, 60 (21): 18-37.   DOI: 10.3778/j.issn.1002-8331.2403-0152
    Abstract (141)      PDF(pc) (7680KB) (226)
    Human pose estimation is a key technology in the field of computer vision that identifies human postures by detecting body keypoints. With the rapid advancement of deep learning, it has become the dominant approach in human pose estimation, achieving significant progress. This paper reviews single-person pose estimation research based on deep learning, examining the issue from four perspectives: data preprocessing, network architecture design, supervised learning methods, and post-processing techniques. It also explores new representations of keypoints and the application of Transformer models in this area. Additionally, the paper introduces common datasets and performance metrics, and delves into the current challenges and future directions in the field of single-person pose estimation.
    Review of Lung CT Image Lesion Region Segmentation Based on Deep Learning
    LI Xiaotong, MA Sufen, SHENG Hui, WEI Guohui, LI Xintong
    Computer Engineering and Applications    2025, 61 (4): 25-42.   DOI: 10.3778/j.issn.1002-8331.2403-0315
    Abstract (212)      PDF(pc) (4394KB) (223)
    Lung cancer poses a serious threat to people’s lives and health. The morphology of lesion areas in lung CT images is complex and diverse, and achieving high-precision segmentation of these areas has become a highly challenging key issue in the field of computer-aided diagnosis. Deep learning-based segmentation of lung lesion regions not only helps doctors diagnose early lung cancer quickly and accurately, but also has important clinical value for the treatment of lung cancer. To support in-depth research on lung lesion segmentation techniques, common datasets and evaluation indicators are introduced. Deep learning models for lung lesion region segmentation are reviewed in three categories: segmentation models based on convolutional neural networks, segmentation models based on U-Net, and segmentation models based on generative adversarial networks. The innovations of domestic and foreign research over the past 5 years are summarized through specific experiments. The segmentation performance of the various models is compared and analyzed, their advantages and disadvantages are summarized, and development directions in this field are discussed.
    Improved Target Detection Algorithm for UAV Images with RT-DETR
    JIANG Maoxiang, SI Zhanjun, WANG Xiaozhe
    Computer Engineering and Applications    2025, 61 (1): 98-108.   DOI: 10.3778/j.issn.1002-8331.2405-0331
    Abstract (216)      PDF(pc) (5878KB) (220)
    This paper proposes an improved RT-DETR algorithm for detecting targets in images captured by light and small-sized unmanned aerial vehicles (UAVs). To address the low detection accuracy caused by flexible and diverse targets and by complex and variable environments, the proposed method enhances the feature extraction capability of the detection model by integrating lightweight SimAM attention and inverted residual modules into the ResNet-r18 backbone network. Furthermore, a cascaded group attention mechanism is employed to optimize the inverted residual modules and feature interaction modules, improving feature selection capability and yielding more refined target detection information. Additionally, a 160×160 detection layer is introduced in the neck network to enhance the perception of small targets during the feature fusion stage. Experimental results on the VisDrone2019 dataset show that the improved model has fewer parameters and higher detection accuracy. Further experiments on the Alver_Lab_Ulastirma and HIT-UAV datasets validate the effectiveness and robustness of the proposed improvements.
    Survey of Link Prediction in Knowledge Graph Embedding Methods
    LIU Haichao, LIU Lin, WANG Hailong, ZHAO Weiwei, LIU Jing
    Computer Engineering and Applications    2025, 61 (8): 17-34.   DOI: 10.3778/j.issn.1002-8331.2407-0158
    Abstract (176)      PDF(pc) (1109KB) (216)
    Knowledge graphs often suffer from issues such as missing entities and relationships. Knowledge graph completion, which addresses these deficiencies, has garnered significant attention from researchers. Link prediction based on knowledge graph embedding, an important research direction in knowledge graph completion, can predict missing entities or relationships in the knowledge graph, thereby enhancing its completeness. Firstly, this paper expounds the research background, significance, and definition of link prediction in knowledge graphs. Secondly, based on the number of entities in the embedding unit, link prediction models for knowledge graph embedding are divided into two-entity embedding link prediction models and multi-entity embedding link prediction models. The ideas behind model construction are elaborated, the experimental results are analyzed, and the advantages and disadvantages of the various models are summarized. Finally, the current status and future research directions of link prediction based on knowledge graph embedding are outlined to provide inspiration and guidance for subsequent development.
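As an illustration of what embedding-based link prediction looks like in practice, the following is a minimal sketch using a TransE-style translation score. TransE is a widely known model but is not named in the abstract, so its use here is an assumption; the entity names, the 2-D vectors, and the helper names `transe_score` and `predict_tail` are all hypothetical and hand-picked for the example.

```python
import math

def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance of (head + relation) from tail.

    A score closer to zero means the triple (head, relation, tail) is more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def predict_tail(head, relation, entity_vectors):
    """Rank every candidate entity as the missing tail, best candidate first."""
    scored = [(name, transe_score(head, relation, vec))
              for name, vec in entity_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings chosen by hand (hypothetical, not learned from data).
entities = {
    "Paris": [1.0, 1.0],
    "France": [2.0, 3.0],
    "Berlin": [5.0, 1.0],
    "Germany": [6.0, 3.0],
}
capital_of = [1.0, 2.0]  # relation vector, also hand-picked

# Query (Paris, capital_of, ?): the missing tail is predicted by ranking.
ranking = predict_tail(entities["Paris"], capital_of, entities)
print(ranking[0][0])
```

In a real system the vectors would be learned by minimizing a ranking loss over known triples; the sketch only shows the scoring and ranking step that fills in a missing entity.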
    Small Object Detection Algorithm for UAV Images Based on Improved YOLOv8
    HOU Ying, WU Yan, KOU Xurui, HUANG Jiachao, TUO Jindou, WANG Yuqi, HUANG Xiaojun
    Computer Engineering and Applications    2025, 61 (11): 83-92.   DOI: 10.3778/j.issn.1002-8331.2411-0214
    Abstract (144)      PDF(pc) (1986KB) (215)
    Unmanned aerial vehicle (UAV) images contain a large number of densely distributed small targets, which easily leads to missed and false detections of small targets. Therefore, an improved YOLOv8 small target detection algorithm for UAV images is proposed. Firstly, by utilizing high-resolution shallow feature information with smaller receptive fields and finer spatial detail, a small object detection head is added and four feature extraction heads are used to improve the small object detection rate. Secondly, a small object detection module group consisting of a ConvSPD convolution module and a BiFormer attention enhancement module is designed to improve the YOLOv8 backbone network, which effectively enhances its ability to capture shallow detail features of small objects. Subsequently, to meet the hardware deployment requirements of the model, a reparameterizable Rep-PAN model is adopted to optimize the Neck network. Finally, to improve positioning accuracy, the Focaler-CIoU loss function with a target-size-adaptive penalty factor is adopted in the Head network to optimize the regression positioning loss. On the VisDrone-2019 dataset, the improved algorithm achieves 51.2% average detection accuracy, 10.9 percentage points higher than YOLOv8. In addition, its detection frame rate reaches 63.7 FPS, giving it good real-time performance.
    Survey on Lane Line Detection Techniques for Classifying Semantic Information Processing Modalities
    HONG Shuying, ZHANG Donglin
    Computer Engineering and Applications    2025, 61 (5): 1-17.   DOI: 10.3778/j.issn.1002-8331.2406-0160
    Abstract (193)      PDF(pc) (2981KB) (214)
    With the rapid development of autonomous driving technology, lane line detection, as one of its key components, has attracted widespread attention and shown great potential for application in intelligent transportation systems. However, traditional lane line detection techniques usually struggle to provide satisfactory recognition accuracy in complex environments. This paper reviews the development of lane line detection technology, systematically organizes 84 advanced algorithms, and divides them into four categories based on how they process semantic information: semantic segmentation assistance, semantic information fusion, semantic information enhancement, and semantic relationship modeling. Through an in-depth analysis of the technical characteristics and advantages of these algorithms, the main limitations of current lane line detection technology are revealed. Finally, future development directions for lane line detection technology are proposed, especially regarding the utilization of semantic information, and potential research directions are pointed out.
    LOL-YOLO: Low-Light Object Detection Incorporating Multiple Attention Mechanisms
    JIANG Changjiang, HE Xuying, XIANG Jie
    Computer Engineering and Applications    2024, 60 (24): 177-187.   DOI: 10.3778/j.issn.1002-8331.2406-0424
    Abstract (208)      PDF(pc) (7039KB) (214)
    To address the challenges of low-illumination target detection, such as blurry night scenes, indistinct boundaries, and pronounced brightness disparities, this paper introduces LOL-YOLO (low-light YOLO), a detection method based on dynamic feature fusion. A self-correcting illumination module is incorporated to enhance low-light image quality and counteract target obscurity under low illumination. A dynamic feature extraction module is proposed, which leverages an attention mechanism combining large convolutional kernels with deformable convolutions, enabling extensive and agile capture of contextual information. Finally, a dynamic detection head is devised to improve perception across varying scales, spatial positions, and tasks, thereby refining detection accuracy and robustness. Experimental validation on the ExDark, DarkFace, and NPD (nighttime pedestrian detection) datasets demonstrates significant accuracy improvements over prevalent algorithms, confirming the effectiveness of the proposed method.
    Algorithmic Research Overview on Graph Coloring Problems
    SONG Jiahuan, WANG Xiaofeng, HU Simin, JIA Jingwei, YAN Dong
    Computer Engineering and Applications    2024, 60 (18): 66-77.   DOI: 10.3778/j.issn.1002-8331.2403-0434
    Abstract (276)      PDF(pc) (4612KB) (205)
    The graph coloring problem (GCP) is a classic combinatorial optimization problem that has been widely applied in fields such as mathematics, computer science, and biological science. Because the graph coloring problem is NP-hard, there is currently no known exact algorithm that solves it in polynomial time. To support the design of efficient algorithms for this problem, it is necessary to review the existing ones. Existing algorithms are mainly divided into intelligent optimization algorithms, heuristic algorithms, reinforcement learning algorithms, and others. A comparative analysis is carried out in terms of algorithm principles, improvement ideas, performance, and accuracy; the advantages and disadvantages of the algorithms are summarized; and research directions and algorithm design paths for GCP are pointed out, providing guidance for research on related problems.
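For readers unfamiliar with the heuristic family mentioned above, the following is a minimal sketch of a greedy coloring heuristic with a largest-degree-first vertex order (often called Welsh-Powell ordering). It is a generic textbook heuristic chosen for illustration, not a method taken from the reviewed paper; the function names and the example graph are assumptions.

```python
def greedy_coloring(adjacency):
    """Color vertices one by one, giving each the smallest color unused by its neighbors.

    Vertices are visited in order of decreasing degree. The result is always a
    proper coloring, but it is not guaranteed to use the minimum number of colors.
    """
    order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    colors = {}
    for vertex in order:
        used = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors

def is_proper(adjacency, colors):
    """Check that no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])

# A cycle on five vertices: an odd cycle needs exactly three colors.
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(graph)
print(coloring, is_proper(graph, coloring))
```

The heuristic runs in polynomial time precisely because it gives up optimality, which is the trade-off that motivates the intelligent optimization and learning-based approaches the survey compares.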
    Survey of Retrieval-Augmented Generation Based on Large Language Models
    LIU Xueying, YUN Jing, LI Bo, SHI Xiaoguo, ZHANG Yuying
    Computer Engineering and Applications    2025, 61 (13): 1-25.   DOI: 10.3778/j.issn.1002-8331.2410-0088
    Abstract (183)      PDF(pc) (1412KB) (197)
    Artificial intelligence agents provide efficient solutions to complex tasks and have recently gained attention in industry. As one of the paradigms of artificial intelligence agents, retrieval-augmented generation (RAG), which aims to enhance the quality of generated responses by combining information retrieval and content generation techniques, has gradually become a focus of research. Drawing on studies of retrieval-augmented generation methods at home and abroad, the basic concept and workflow of RAG are elaborated, the current state of the technology is summarized, the advantages and disadvantages of existing RAG techniques are analyzed, and the existing evaluation metrics, datasets, and benchmarks are organized. Finally, the challenges RAG faces in future application scenarios are discussed and its future development directions are envisioned.
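The retrieve-then-generate workflow described above can be sketched in a few lines. This is a toy illustration under stated assumptions: retrieval is a simple word-overlap ranking rather than a dense or learned retriever, the generation step is left out (the assembled prompt would be passed to whatever language model is in use), and the corpus, query, and function names are all hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, passage):
    """Count tokens shared by query and passage (multiset intersection)."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum((q & p).values())

def retrieve(query, corpus, k=2):
    """Return the k passages with the highest overlap score, best first."""
    ranked = sorted(corpus, key=lambda passage: overlap_score(query, passage),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Augment the query with retrieved context before generation."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}\nAnswer:")

corpus = [
    "RAG combines information retrieval with content generation.",
    "Graph coloring is a classic combinatorial optimization problem.",
    "Retrieved passages are added to the prompt before generation.",
]
query = "How does RAG use retrieval before generation?"
top = retrieve(query, corpus, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

The unrelated passage never reaches the prompt, which is the point of the retrieval step: the generator sees only context that is likely to be relevant to the question.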
    Embedded Detection Method of Coating Surface Defects Based on YOLOv4-tiny-SR
    ZHAO Hui, HOU Xutao, SONG Long, XU Ke, SHA Jianjun, CHEN Zongyang
    Computer Engineering and Applications    2025, 61 (8): 239-249.   DOI: 10.3778/j.issn.1002-8331.2312-0180
    Abstract (134)      PDF(pc) (4597KB) (194)
    A detection method for coating surface defects is proposed to solve the problems of low detection accuracy, slow speed, and high hardware requirements in embedded detection of coating surface defects. YOLOv4-tiny-SR uses a new model block, DSRBlock, whose local structure greatly reduces memory consumption and increases detection speed while maintaining detection accuracy. A geometric average clustering method is proposed, which changes the update rule for cluster centers from the arithmetic average to the geometric average to prevent cluster centers from drifting toward large target boxes. In addition, for difficult-to-detect samples, a hard sample loss function is designed to increase the learning intensity of the network and improve detection performance. Comparison experiments on measured coating surface defect data show that the proposed method has clear advantages over other methods in parameter count, model size, detection speed, and accuracy. Compared with the current mainstream YOLOv4-tiny, the parameter count is reduced by 51.82%, the model size is reduced by 46%, the speed is increased by 39.47%, and the accuracy is increased by 1.25 percentage points. With faster detection, higher accuracy, and lower memory consumption, the method has high practical value for real-time detection of surface defects on embedded devices in industrial applications, and can be extended to related fields.
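The effect of switching the cluster-center update from an arithmetic to a geometric average can be seen with a small numeric sketch. It shows only the center-update step for one cluster of (width, height) boxes; the cluster assignment step and the distance measure used by the paper are not reproduced, and the box sizes and function names are hypothetical.

```python
import math

def arithmetic_center(boxes):
    """Conventional update: arithmetic mean of widths and heights."""
    n = len(boxes)
    return (sum(w for w, _ in boxes) / n, sum(h for _, h in boxes) / n)

def geometric_center(boxes):
    """Geometric-mean update, computed in log space for numerical stability."""
    n = len(boxes)
    log_w = sum(math.log(w) for w, _ in boxes) / n
    log_h = sum(math.log(h) for _, h in boxes) / n
    return (math.exp(log_w), math.exp(log_h))

# One cluster holding three small boxes and one large outlier (pixels).
cluster = [(10, 10), (10, 10), (10, 10), (160, 160)]

arith = arithmetic_center(cluster)
geo = geometric_center(cluster)
print(arith, geo)
```

For this cluster the arithmetic center is (47.5, 47.5), pulled far toward the single large box, while the geometric center is approximately (20, 20) and stays close to the majority of small boxes. This is the drift toward large target boxes that the abstract says the geometric update is meant to avoid.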
    Lightweight Detection of Ceramic Tile Surface Defects on Improved YOLOv8
    YU Songsen, XUE Guopeng, HE Huang, ZHAO Gui, WEN Huosheng
    Computer Engineering and Applications    2024, 60 (18): 88-102.   DOI: 10.3778/j.issn.1002-8331.2312-0155
    Abstract (163)      PDF(pc) (8560KB) (194)
    In tile surface defect detection, it is difficult to detect small target defects while maintaining a given detection speed, and overall detection accuracy remains low. This paper proposes an improved tile surface defect detection method based on YOLOv8. Firstly, the original large-format tile dataset is preprocessed, and tile images suited to the input size of YOLOv8 are obtained through a slicing operation to prevent tile defects from being lost during scaling. Secondly, because small target defects make up a large proportion of the defects on the tile surface, the SPD-Conv structure is used in place of the traditional downsampling method; it retains all of the information by moving it into the channel dimension, which improves the detection of small target defects. Thirdly, the original C2f module in YOLOv8 is modified by adding the efficient channel attention (ECA) mechanism to form the C2f_ECA module, which replaces C2f in the backbone network so that the network pays more attention to defect information and is less affected by background information during feature extraction. Fourthly, a tiny target detection head is added after the second downsampling to improve YOLOv8's ability to detect tiny targets. The method is experimentally validated on the Tianchi tile defect detection dataset, and the improved model achieves 57.7%, 86.6%, and 60.6% on mAP50-95, mAP50, and mAP75, respectively, which are 9.4, 5, and 14.3 percentage points higher than the base network YOLOv8s. Meanwhile, the model has higher accuracy and much lower complexity than YOLOv8m; it is a lightweight model that meets the needs of industrial deployment.
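The claim that SPD-Conv downsamples without discarding information can be illustrated with the space-to-depth rearrangement at its core. The sketch below works on a single-channel map stored as nested lists and omits the non-strided convolution that follows in SPD-Conv; the function name and the toy feature map are assumptions made for the example.

```python
def space_to_depth(feature_map, scale=2):
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Each sub-map takes every scale-th pixel starting from a different offset,
    so spatial resolution drops while every input value is kept in some channel.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):
        for dx in range(scale):
            channels.append([row[dx::scale] for row in feature_map[dy::scale]])
    return channels

fmap = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
sub_maps = space_to_depth(fmap)
for sub_map in sub_maps:
    print(sub_map)
```

A strided convolution or pooling layer with the same output size would keep or merge only part of the input, whereas here all 16 values survive in the four 2×2 sub-maps. That is why the approach is attractive for small defects that occupy only a few pixels.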
    Review of Collaborative Inference Methods for Edge Intelligence
    ZHAO Chanchan, LYU Fei, SHI Bao, YU Xiaomin, YANG Xingchen, YUE Xiaocan
    Computer Engineering and Applications    2025, 61 (3): 1-20.   DOI: 10.3778/j.issn.1002-8331.2406-0040
    Abstract (152)      PDF(pc) (7788KB) (191)
    With the development of edge intelligence, collaborative inference technology has made significant progress in enhancing the efficiency and performance of intelligent applications through collaboration among cloud, edge, and terminal devices. The performance metrics, application scenarios, and challenges of edge intelligence are outlined, introducing four inference paradigms under collaborative inference technology through the rating architecture of edge intelligence: end-to-end collaboration, edge-to-end collaboration, edge-to-edge collaboration, and cloud-edge-end collaboration. Based on the limitations and differences of application scenarios for collaborative inference technology, the advantages, limitations, principles, and optimization goals of collaborative inference technology in different inference paradigms are comprehensively analyzed and compared. The discussion delves into issues such as computational resource allocation, inference latency optimization, and throughput optimization solved by collaborative inference technology in different application scenarios. It also points out challenges in privacy security, communication service resource management, and collaborative training within edge intelligence. Future development trends and research directions are discussed, providing references and insights for research in this field.
    Survey on Prompt Learning
    CUI Jinman, LI Dongmei, TIAN Xuan, MENG Xianghao, YANG Yu, CUI Xiaohui
    Computer Engineering and Applications    2024, 60 (23): 1-27.   DOI: 10.3778/j.issn.1002-8331.2407-0436
    Abstract (180)      PDF(pc) (9840KB) (189)
    Fine-tuned pre-trained language models have achieved remarkable performance in tasks across various domains. However, there is a significant gap in training data and objective function between pre-training and fine-tuning, which limits the effective adaptation of pre-trained language models to downstream tasks. Prompt learning has been proposed to bridge the gap between pre-training and fine-tuning, and can be applied well to few-shot or even zero-shot scenarios. The core idea of prompt learning is to wrap the original input in a prompt template that converts the downstream task data into natural language form, feed it into the pre-trained model to obtain a prediction, and then map the output to the corresponding labels through a language verbalizer. This paper systematically reviews current approaches to prompt learning and, following its implementation steps, introduces research progress in two stages: prompt template construction and language verbalizer construction. Prompt template methods are subdivided into manually constructed templates, automatically constructed templates, templates constructed with external knowledge, and thought prompting methods. Language verbalizer methods are subdivided into manual verbalizers, search-based verbalizers, soft verbalizers, and verbalizers that introduce external knowledge. The paper then summarizes the main applications of prompt learning in natural language processing, computer vision, and multimodal fields, and analyzes related experiments. Finally, it summarizes the current situation and challenges of prompt learning and looks ahead to its future technological development.
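The two stages described above, prompt template and verbalizer, can be sketched for a sentiment task as follows. This is a minimal illustration with a manually constructed template and a manual verbalizer; the template wording, the label words, and the probabilities standing in for a masked language model's output are all hypothetical, and no real model is called.

```python
def apply_template(text, template="{text} Overall, it was [MASK]."):
    """Wrap the raw input in a cloze-style prompt template."""
    return template.format(text=text)

# Manual verbalizer: each task label is tied to a few label words.
VERBALIZER = {
    "positive": ["great", "good"],
    "negative": ["terrible", "bad"],
}

def verbalize(mask_word_probs, verbalizer=VERBALIZER):
    """Map word probabilities at the [MASK] position to task labels.

    Each label's score is the summed probability of its label words.
    """
    label_scores = {
        label: sum(mask_word_probs.get(word, 0.0) for word in words)
        for label, words in verbalizer.items()
    }
    best_label = max(label_scores, key=label_scores.get)
    return best_label, label_scores

prompt = apply_template("The plot was gripping.")

# Stand-in for what a pre-trained masked language model might return
# for the [MASK] slot (made-up numbers, for illustration only).
fake_probs = {"great": 0.5, "good": 0.25, "bad": 0.125, "terrible": 0.0625}
label, scores = verbalize(fake_probs)
print(prompt, label, scores)
```

Because the task is recast as filling in a blank, the pre-trained model is used in the same way it was trained, which is how prompt learning narrows the gap between pre-training and downstream use.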
    Improved YOLOv8 Urban Vehicle Target Detection Algorithm
    XU Degang, WANG Shuangchen, WANG Zaiqing, YIN Kedong
    Computer Engineering and Applications    2024, 60 (18): 136-146.   DOI: 10.3778/j.issn.1002-8331.2401-0277
    Abstract (205)      PDF(pc) (6421KB) (184)
    To address missed detections, low precision, and weak generalization in urban vehicle target detection algorithms for complex traffic scenes, an enhanced YOLOv8 algorithm is proposed. Firstly, this paper replaces the C2f module in the backbone network with an improved GAM-C2f structure to strike a balance between computational efficiency and model accuracy. Secondly, an SPPFAPGC module is designed to prevent the local feature loss caused by max pooling operations in the SPPF structure. This enriches the feature map and, combined with a small target detection head, strengthens the detection of distant small vehicles while integrating local and global features effectively. Finally, to suppress harmful gradients generated by low-quality images, this paper uses the WIOU loss function instead of CIoU for improved bounding box regression performance, faster convergence, and higher regression accuracy. Experimental results on street vehicle datasets demonstrate that, compared with the benchmark model YOLOv8n, the improved algorithm achieves a 1.6 percentage point increase in mAP50 and a 2.0 percentage point increase in Recall, effectively improving detection of small-target vehicles in urban traffic scenes. Verification on the VisDrone2019 dataset also shows improvements of 1.1 percentage points in mAP50 and 1.6 percentage points in Recall, further confirming the advantage of the enhanced algorithm over other mainstream algorithms in accuracy and recall for urban vehicle detection tasks.
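Both loss functions named above are built on the intersection over union (IoU) of a predicted box and a ground-truth box. The sketch below computes only this plain IoU term; the extra penalty and weighting terms that distinguish CIoU and WIOU are deliberately not reproduced. The `(x1, y1, x2, y2)` corner format and the function name are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth))  # overlap area 1, union area 7
```

A basic regression loss would be `1 - iou(predicted, ground_truth)`; IoU-based losses such as those in the abstract add further terms on top of this quantity.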
    Review of Research Progress in Object Detection Driven by Deep Learning
    SHAN Xianying, ZHANG Lin, LI Zehui
    Computer Engineering and Applications    2025, 61 (1): 24-41.   DOI: 10.3778/j.issn.1002-8331.2407-0038
    Abstract (200)      PDF(pc) (7781KB) (184)
    In recent years, deep learning, driven by high-performance GPU computing, has rapidly expanded into security, healthcare, and industry. Object detection models have evolved from traditional methods to convolutional neural networks (CNN), significantly saving resources. This review outlines the development of object detection and recent advances in deep learning by referencing extensive literature and following a two-stage framework. It compares model performance across different datasets, summarizes the strengths and weaknesses of various methods, and highlights key datasets. The review also discusses the practical applications of object detection algorithms, particularly in autonomous driving, medical imaging, and remote sensing. Finally, it explores the opportunities and challenges for future research in deep learning-driven object detection.
    Applications of Deep Learning in Knowledge Graph Construction and Reasoning
    SUN Yu, LIU Chuan, ZHOU Yang
    Computer Engineering and Applications    2025, 61 (6): 36-52.   DOI: 10.3778/j.issn.1002-8331.2408-0280
    Abstract (176)      PDF(pc) (892KB) (180)
    Knowledge graphs, as a structured form of knowledge representation in the field of natural language processing, can describe concepts and their relationships in the real world, and are often used in information retrieval, data management, and other fields. Deep learning has gradually become a research hotspot because it can automatically learn underlying patterns and hierarchical representations from diverse data, which can be used for the precise construction of, and effective reasoning over, large-scale, high-quality knowledge graphs. To further promote the technological integration of deep learning and knowledge graphs, this paper focuses on the construction and reasoning processes of knowledge graphs, providing a comprehensive introduction to the relevant theories and latest research achievements in deep learning-based knowledge representation, knowledge extraction, knowledge fusion, and knowledge reasoning. In line with research trends in recent years, the paper also highlights and summarizes the latest results on integrating graph deep learning, which is suited to feature inference on graph data, with knowledge reasoning. Finally, an overview and technical outlook are given on the integration and development of deep learning and knowledge graphs, providing reference and ideas for future research directions.
    Small Defect Detection Algorithm of Particle Board Surface Based on Improved YOLOv5s
    ZHA Jian, CHEN Xianzhong, WANG Wencai, GUAN Yuyin, ZHANG Jie
    Computer Engineering and Applications    2024, 60 (17): 158-166.   DOI: 10.3778/j.issn.1002-8331.2305-0475
    Abstract (149)      PDF(pc) (4887KB) (179)
    An improved algorithm for detecting particle board defects, YOLOv5s-ATG, is proposed on the basis of YOLOv5s to address the poor precision of small target detection in current particle board defect detection. To handle particle board defects with small targets and large scale variation, the original detection head is combined with the adaptive spatial feature fusion (ASFF) network to obtain better feature fusion. A Transformer module is introduced into the backbone network, using a multi-head self-attention mechanism to capture global spatial relationships and enhance the feature extraction capability of the network. To balance the accuracy and complexity of the model, the Ghostv2 module is added to the backbone and neck of the network to improve the real-time performance of the algorithm. The experimental results show that the mean average precision (mAP) of the improved algorithm on an actual particle board defect dataset reaches 0.901, which is 0.046 higher than the original model; for the small target defect Gluespots, mAP is increased by 0.138.
    LF-YOLO for Strip Surface Defect Detection in Industrial Scenes
    MA Xiaoyao, LI Rui, LI Zili, ZHAI Wenzheng
    Computer Engineering and Applications    2024, 60 (18): 78-87.   DOI: 10.3778/j.issn.1002-8331.2404-0411
    Abstract (172)      PDF(pc) (4872KB) (178)
    To address the low accuracy of traditional defect detection algorithms in practical applications, caused by the small size of strip surface defects and blurry images collected in industrial scenarios, an LF-YOLO algorithm for strip surface defect detection in industrial scenarios is proposed. The model upsamples the input pixels with a specially designed local filling upsampling module to improve its ability to recognize blurred images and reduce the missed detection rate of small target defects. The FReLU activation function, which is designed for visual tasks, is introduced to improve the accuracy with which the model locates defects. In addition, a lightweight local attention mechanism is proposed and combined with the feature extraction module C2f to enhance the extraction of features from defects of different sizes. Experimental results on the Northeastern University open source strip steel dataset NEU-DET and on GC10-DET show that the average detection accuracy of the improved model is 7.0 and 15.4 percentage points higher, respectively, than that of the original YOLOv8 algorithm, and is better than that of other classic target detection models. The validity of each module is further verified through ablation experiments.
    Overview of Multi-View 3D Reconstruction Techniques in Deep Learning
    WANG Wenju, TANG Bang, GU Zehua, WANG Sen
    Computer Engineering and Applications    2025, 61 (6): 22-35.   DOI: 10.3778/j.issn.1002-8331.2405-0328
    Abstract (153)      PDF(pc) (3077KB) (178)
    Classic multi-view 3D reconstruction methods have difficulty reconstructing complex objects, produce poor reconstruction results, and are hard to extend to high resolution, so deep learning methods have been introduced to reconstruct 3D models with higher accuracy. Multi-view 3D reconstruction algorithms using deep learning methods are therefore systematically summarized, analyzed, and compared, and the algorithms of recent years are classified and organized according to explicit and implicit geometry representations. The focus is on neural implicit 3D reconstruction algorithms that combine implicit functions and volume rendering, which currently achieve high reconstruction accuracy, and quantitative and qualitative analyses are conducted on some of these algorithms. In addition, commonly used datasets and evaluation indicators are listed, and future research trends and development directions are discussed.