Most Downloaded Articles

    Most Downloaded in Recent Month
    Review of Research on Artificial Intelligence in Traditional Chinese Medicine Diagnosis and Treatment
    SU Youli, HU Xuanyu, MA Shijie, ZHANG Yuning, Abudukelimu Abulizi, Halidanmu Abudukelimu
    Computer Engineering and Applications    2024, 60 (16): 1-18.   DOI: 10.3778/j.issn.1002-8331.2312-0400
    Abstract (201)      PDF(pc) (6171KB)(224)
    The field of traditional Chinese medicine (TCM) diagnosis and treatment is gradually moving towards standardization, objectification, modernization, and intelligence. In this process, the integration of artificial intelligence (AI) has greatly advanced TCM diagnosis and treatment, scientific research, and the inheritance of TCM knowledge. Starting from the current state of research on AI in TCM, this review traces the application and development of AI in TCM through three stages: expert systems and rule engines, traditional machine learning algorithms, and deep learning. It then summarizes the TCM knowledge management tools and large language models of recent years. Finally, the paper analyzes the challenges that AI in TCM faces at this stage: data fairness, multimodal data understanding, model robustness, personalized medicine, and interpretability. Addressing these challenges requires continued exploration of possible solutions to promote the in-depth development of intelligent TCM diagnosis and treatment and thus better meet people's health needs.
    Review of Application of Visual Foundation Model SAM in Medical Image Segmentation
    SUN Xing, CAI Xiaohong, LI Ming, ZHANG Shuai, MA Jingang
    Computer Engineering and Applications    2024, 60 (17): 1-16.   DOI: 10.3778/j.issn.1002-8331.2401-0136
    Abstract (157)      PDF(pc) (7912KB)(173)
    With the continuous development of foundation model technology, visual foundation models represented by the segment anything model (SAM) have made significant breakthroughs in image segmentation. Driven by prompts, SAM accomplishes a range of downstream segmentation tasks and aims to address all image segmentation problems comprehensively. Applying SAM to medical image segmentation is therefore of great significance: its generalization performance allows it to adapt to various medical images, giving healthcare professionals a more comprehensive understanding of anatomical structures and pathological information. This paper introduces commonly used datasets for image segmentation and explains SAM’s network architecture and generalization capabilities in detail. It focuses on a thorough analysis of SAM’s application to five major categories of medical images: whole-slide imaging, magnetic resonance imaging, computed tomography, ultrasound, and multimodal images. The review summarizes the strengths and weaknesses of SAM, along with corresponding improvement methods. In light of current challenges in medical image segmentation, the paper discusses and anticipates future directions for SAM’s development.
    Cross-Attention Fusion Learning of Transformer-CNN Features for Person Re-Identification
    XIANG Jun, ZHANG Jincheng, JIANG Xiaoping, HOU Jianhua
    Computer Engineering and Applications    2024, 60 (16): 94-104.   DOI: 10.3778/j.issn.1002-8331.2311-0452
    Abstract (112)      PDF(pc) (4546KB)(141)
    Convolutional neural networks (CNNs) focus on local features and have difficulty obtaining global structural information. Transformer networks model long-distance feature dependencies but tend to ignore local feature details. This paper proposes a person re-identification algorithm based on cross-attention fusion learning, which combines the strengths of CNN and Transformer feature learning networks to enrich the local features of pedestrians and improve the global feature representation. The proposed model consists of three parts: the CNN branch mainly extracts local details; the Transformer branch focuses on global feature information; and the cross-attention fusion branch uses the self-attention mechanism to calculate the correlation between the features from the two branches, fuses them, and thereby improves the representation ability of the model. Ablation experiments and results on the Market1501 and DukeMTMC-reID datasets demonstrate the effectiveness of the proposed method.
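    The fusion step described in the abstract above can be illustrated with a minimal cross-attention sketch, in which tokens from one branch attend over tokens from the other. This is not the authors' implementation: the function names, the identity Q/K/V projections, and the residual addition are assumptions made only to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Let each query token attend over the tokens of the other branch.

    queries:     n vectors of dimension d (e.g. CNN branch tokens)
    keys_values: m vectors of dimension d (e.g. Transformer branch tokens)
    Assumption: learned Q/K/V projections are replaced by the identity.
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        # Scaled dot-product similarity between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Attention output: weighted sum of the other branch's tokens.
        attended = [sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)]
        # Assumed residual connection: keep the query's own features.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


cnn_tokens = [[1.0, 0.0], [0.0, 1.0]]
transformer_tokens = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
fused_tokens = cross_attention(cnn_tokens, transformer_tokens)
```

    In a trained model the two branches would be projected by learned matrices before the dot product, and the attention would typically be applied in both directions.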
    Small Defect Detection Algorithm of Particle Board Surface Based on Improved YOLOv5s
    ZHA Jian, CHEN Xianzhong, WANG Wencai, GUAN Yuyin, ZHANG Jie
    Computer Engineering and Applications    2024, 60 (17): 158-166.   DOI: 10.3778/j.issn.1002-8331.2305-0475
    Abstract (99)      PDF(pc) (4887KB)(137)
    An improved algorithm, YOLOv5s-ATG, based on YOLOv5s is proposed for detecting particle board defects, addressing the poor precision of current small-target detection in particle board defect inspection. To cope with the small targets and large scale variations of particle board defects, the original detection head is combined with the adaptive spatial feature fusion (ASFF) network to obtain better feature fusion. A Transformer module is introduced into the backbone network; it uses a multi-head self-attention mechanism to capture global spatial relationships and enhance the feature extraction capability of the network. To balance the accuracy and complexity of the model, the Ghostv2 module is added to the backbone and neck of the network to improve the real-time performance of the algorithm. The experimental results show that the mean average precision (mAP) of the improved algorithm on a real particle board defect dataset reaches 0.901, which is 0.046 higher than that of the original model; for the small-target defect class Gluespots, mAP increases by 0.138.
    Comprehensive Review of Large Language Model Fine-Tuning
    ZHANG Qintong, WANG Yuchao, WANG Hexi, WANG Junxin, CHEN Hai
    Computer Engineering and Applications    2024, 60 (17): 17-33.   DOI: 10.3778/j.issn.1002-8331.2312-0035
    Abstract (93)      PDF(pc) (6335KB)(118)
    The rise of large-scale language models signifies a new milestone in the field of deep learning, with fine-tuning techniques playing a crucial role in optimizing model performance. This paper provides a comprehensive overview of fine-tuning techniques for large-scale language models. It reviews the development stages of language models, including statistical language models, neural network language models, pre-trained language models, and large language models. The basic concepts of fine-tuning are explored, covering classic fine-tuning, parameter-efficient fine-tuning, prompt tuning, and reinforcement learning fine-tuning. The paper delves into the principles and development of each fine-tuning technique, offering a comparative analysis across these four major categories. In conclusion, the paper summarizes the current state of research on fine-tuning techniques and underscores the potential research value in this domain, providing insights into future directions of development.
    Improved Road Object Detection Algorithm for YOLOv8n
    GAO Deyong, CHEN Taida, MIAO Lan
    Computer Engineering and Applications    2024, 60 (16): 186-197.   DOI: 10.3778/j.issn.1002-8331.2403-0383
    Abstract (107)      PDF(pc) (9556KB)(100)
    To address the low detection accuracy and high missed-detection rates caused by varying object scales and complex background interference in road scenes, an enhanced road object detection algorithm based on YOLOv8n is proposed. Firstly, the diverse branch block (DBB) is introduced to construct the C2fDBB module, which replaces the original C2f module and enhances the network's capacity to extract multi-scale features. Secondly, building upon the path aggregation network (PANet), the asymptotic feature pyramid network (AFPN) concept is leveraged to propose the path aggregation progressive feature pyramid network (PA-AFPN) feature fusion method, enhancing the network's ability to integrate multi-scale features effectively. Additionally, the SPPF (spatial pyramid pooling fast) with dual-branch structure incorporating triplet attention (SPPF2_TA) module is designed, which efficiently integrates multi-scale information through an average pooling branch and a triplet attention (TA) mechanism, effectively reducing the impact of background interference on detection. Finally, MPDIoU is adopted as the new bounding box regression loss function to replace the original loss function, speeding up convergence and improving object localization precision. Experimental results on the public road benchmark datasets BDD100K and SODA10M demonstrate that the improved algorithm increases mAP@0.5 by 5.7 percentage points and 7.3 percentage points, respectively, compared to the baseline algorithm, while reducing the computational load by 0.6 GFLOPs. Compared to other mainstream object detection methods, the proposed algorithm shows notable advantages in terms of FLOPs, FPS, and mAP@0.5, making it more suitable for object detection tasks in road scenes.
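    The MPDIoU loss mentioned in the abstract above can be sketched as follows. The sketch uses the commonly cited definition of MPDIoU (IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared image diagonal). Whether the paper uses exactly this form is an assumption, and the function names are illustrative.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d_top_left = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d_bottom_right = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2

    # Normalize by the squared image diagonal so the penalty is scale-free.
    diag_sq = img_w ** 2 + img_h ** 2
    return iou - d_top_left / diag_sq - d_bottom_right / diag_sq


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Regression loss: 0 for identical boxes, larger as corners drift apart."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


loss_same = mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 100, 100)
loss_far = mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100)
```

    Unlike plain IoU, the corner-distance terms keep the loss informative even when the predicted and ground-truth boxes do not overlap.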
    Improved Road Defect Detection Algorithm Based on YOLOv8
    WANG Xueqiu, GAO Huanbing, JIA Zemeng
    Computer Engineering and Applications    2024, 60 (17): 179-190.   DOI: 10.3778/j.issn.1002-8331.2404-0288
    Abstract (71)      PDF(pc) (5995KB)(99)
    Various defects can emerge on the road surface after prolonged use. Failing to promptly detect and repair these defects can significantly reduce the road’s lifespan and jeopardize driving safety. Consequently, real-time detection of road defects assumes paramount importance. However, traditional detection methods suffer from sluggish speed and hefty cost requirements. Hence, to tackle these challenges, a novel road detection algorithm called DML-YOLO is proposed, which builds upon the YOLOv8 framework. This algorithm integrates the MultiPath coordinate attention (MPCA) mechanism into the backbone network to enhance feature extraction. Additionally, the C2f-MPDC module is introduced to dynamically adjust the receptive field and improve detection capabilities. Furthermore, the network’s neck structure is redesigned, introducing a novel diversity feature pyramid network (DFPN) that reduces model size and fuses low-level feature maps to extract rich, detailed information and elevate the success rate of detecting small targets. Moreover, a lightweight shared convolutional detection head (LSCD head) is meticulously designed to enhance detection efficiency while reducing model size. Ultimately, extensive experimental results demonstrate that DML-YOLO achieves remarkable average detection precision, with mAP@0.5 scores of 89.6% on the RDD2022 dataset and 73.6% on the VOC2007 dataset, surpassing other models tested. Additionally, compared to the YOLOv8 model, DML-YOLO boasts a reduction of 32.37% in parameter count and 14.49% in computational workload, making it highly suitable for deployment in resource-constrained computing environments like embedded systems and mobile devices.
    Survey of Pattern Mining Methods Based on Biological Heuristic Algorithms
    HAN Meng, HE Feifei, ZHANG Ruihua, LI Chunpeng, MENG Fanxing
    Computer Engineering and Applications    2024, 60 (16): 19-33.   DOI: 10.3778/j.issn.1002-8331.2401-0427
    Abstract (88)      PDF(pc) (5424KB)(87)
    Frequent itemset mining, association rule mining, and high-utility itemset mining are three related and developing fields in pattern mining. In recent years, because traditional algorithms cannot cope with the explosive growth of data, heuristic algorithms have become a research hotspot in pattern mining. To reveal the state of research and development in the field, this survey first comprehensively analyzes and summarizes research on frequent pattern and high-utility pattern mining methods from the perspective of various biological heuristic algorithms, such as particle swarm optimization, genetic algorithms, ant colony optimization, and artificial bee colony. Secondly, it summarizes the different biological heuristic pattern mining methods in terms of strategy, comparison algorithms, datasets, advantages, and disadvantages, and compares and analyzes in detail the experimental results and performance indicators on the same datasets. Finally, in view of the shortcomings of current biological heuristic pattern mining methods, future research directions are put forward, including dynamic data streams, multi-objective evolution, fuzzy computing, and complex data types.
    2024-16
    Computer Engineering and Applications    2024, 60 (16): 0-0.  
    Abstract (53)      PDF(pc) (696KB)(84)
    Multi-Label Text Classification Combining Bidirectional Attention and Contrast Enhancement Mechanism
    LI Jiandong, FU Jia, LI Jiaqi
    Computer Engineering and Applications    2024, 60 (16): 105-115.   DOI: 10.3778/j.issn.1002-8331.2311-0079
    Abstract (69)      PDF(pc) (4413KB)(83)
    A multi-label text classification model that integrates bidirectional attention and a contrast enhancement mechanism is proposed to address two issues: the loss of semantic information as text sequences grow longer, and the neglect of rich knowledge from existing instances when predicting specific labels. Firstly, building on BERT word embeddings, the CTransformer model is used to obtain the global dependencies and local structural information of the sequence. At the same time, bidirectional attention and label embedding are used to generate the final text and label representations, and text information interacts with label information to obtain more comprehensive semantic information. Then, a contrast enhancement mechanism is used for KNN instance retrieval, and a multi-label contrastive learning objective is designed to make the model aware of the KNN classification process and improve the quality of the neighbor instances retrieved during inference. Finally, the classifier performs text classification based on the label and text representations. The model is evaluated on three publicly available English datasets, and the experimental results show that it outperforms other mainstream baseline models on P@K and nDCG@K.
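    The two ranking metrics named in the abstract above, P@K and nDCG@K, can be sketched for a single multi-label sample as follows. The sketch follows the convention commonly used in multi-label classification (binary relevance, logarithmic rank discount). The exact variant used in the paper is an assumption, and the function names are illustrative.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, labels, k):
    """Fraction of the top-k predicted labels that are true labels."""
    return sum(labels[i] for i in top_k_indices(scores, k)) / k


def ndcg_at_k(scores, labels, k):
    """DCG of the top-k predictions divided by the best achievable DCG."""
    top = top_k_indices(scores, k)
    dcg = sum(labels[i] / math.log2(rank + 2) for rank, i in enumerate(top))
    # The ideal ranking places all true labels (at most k of them) first.
    n_relevant = min(k, sum(labels))
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(n_relevant))
    return dcg / ideal if ideal > 0 else 0.0


scores = [0.9, 0.1, 0.8, 0.3]   # hypothetical model scores for four labels
labels = [1, 0, 0, 1]           # hypothetical ground truth (1 = label applies)
p3 = precision_at_k(scores, labels, 3)
n3 = ndcg_at_k(scores, labels, 3)
```

    Reported figures are usually these per-sample values averaged over the test set for K = 1, 3, and 5.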
    Knowledge Enhanced Dual-Channel Multi-Head Graph Convolutional Networks for Aspect-Based Sentiment Analysis
    XIE Ze, CHEN Qingfeng, MO Shaocong, LIU Chunyu, QIU Junlai
    Computer Engineering and Applications    2024, 60 (17): 98-106.   DOI: 10.3778/j.issn.1002-8331.2306-0126
    Abstract (54)      PDF(pc) (3652KB)(79)
    Aspect-based sentiment analysis (ABSA) is an important task in natural language processing whose goal is to classify the sentiment polarity of a given aspect word in a sentence. Current state-of-the-art ABSA models use graph neural networks to process the semantic information and syntactic structure of sentences. However, these methods make insufficient use of the information implied by the syntactic dependency tree, do little to mine external knowledge, and neglect to remove the contextual noise introduced by the model. To address these issues, a knowledge enhanced dual-channel multi-head graph convolutional network is proposed. The model builds a semantic-based multi-head graph convolutional network and a syntax-based multi-head graph convolutional network. External sentiment knowledge and syntactic dependency distance are used to reconstruct the syntactic dependency tree, so that the model can fully integrate external knowledge. A self-attention mechanism is used to construct a dynamic semantic graph and filter the introduced noise, so that the model pays more attention to aspect words. The accuracy of the model on the three public benchmark datasets Rest14, Lap14, and Twitter reaches 87.57%, 82.34%, and 77.75%, respectively, which is significantly better than the baseline models.
    Vehicle Detection of Multi-Modal Attention Fusion Under Different Illumination
    WANG Jiaqi, ZHANG Qi, HUANG Wei
    Computer Engineering and Applications    2024, 60 (16): 116-123.   DOI: 10.3778/j.issn.1002-8331.2305-0166
    Abstract (55)      PDF(pc) (3643KB)(78)
    To address the performance degradation of existing single-modal vehicle detection algorithms under illumination changes, a multi-modal detection method, YOLO-MMF, which combines infrared and visible light, is proposed. The method builds an efficient dual-stream feature extraction network that extracts features from visible light images and infrared images separately, and replaces the bottleneck layer in the shallow CSP module of YOLOv5 with the DenseBlock structure to strengthen feature extraction for small targets. It adopts a feature fusion mechanism that uses the discrete cosine transform to obtain high-frequency information, mitigating the loss of detail caused by average pooling, and combines it with the self-attention mechanism so that the network can capture the potential complementarity between modalities, thereby significantly improving vehicle detection performance. The experimental results on the DroneVehicle dataset confirm the effectiveness of the method: average detection accuracy improves by 14.4 percentage points and 10.8 percentage points, respectively, compared to single-modal detection, and the method is more robust in complex situations such as illumination shifts.
    Lightweight YOLOv8 Detection Algorithm for Small Object Detection in UAV Aerial Photography
    LI Yanchao, SHI Weiya, FENG Can
    Computer Engineering and Applications    2024, 60 (17): 167-178.   DOI: 10.3778/j.issn.1002-8331.2402-0230
    Abstract (67)      PDF(pc) (7882KB)(76)
    To address the difficulty of feature extraction and the tendency of small targets to be overwhelmed by noise in complex scenes in unmanned aerial vehicle (UAV) image target detection, this paper proposes a UAV target detection algorithm called SC-YOLO based on YOLOv8s. Firstly, to learn positional details of regions of interest, a self-position module (SPM) attention mechanism based on coordinate attention (CA) is presented. Secondly, to mitigate the impact of channel compression in the Carafe upsampling operator, a Carafe enhancer module (CEM) is proposed. Finally, by analyzing the relationship between the gradient gain function and the size of targets in the dataset, this paper enables WIoU_v3 to focus more on the general quality anchor boxes for medium and small targets. This is validated on the VisDrone2019 dataset, where it is found that WIoU_v3 can better target the parameter setting range for general quality anchor boxes of medium and small targets. The improved YOLOv8s algorithm achieves a mean average precision (mAP) of 43.1% on the VisDrone2019 validation set and 34.8% on the test set, demonstrating superior detection performance among algorithms of similar scale in recent years. The improved algorithm adds only 1.1×10^6 parameters and 1.5 GFLOPs of floating point operations (FLOPs), yet it achieves increases of 2.0 and 2.1 percentage points in detection accuracy on the validation and test sets, respectively. On the Tinyperson dataset, detection accuracy increases by 1.4 percentage points.
    Improved YOLOv8s Model for Small Object Detection from Perspective of Drones
    PAN Wei, WEI Chao, QIAN Chunyu, YANG Zhe
    Computer Engineering and Applications    2024, 60 (9): 142-150.   DOI: 10.3778/j.issn.1002-8331.2312-0043
    Abstract (314)      PDF(pc) (5858KB)(440)
    Object detection from the perspective of drones yields less precise results because image targets are small and densely distributed, class distribution is uneven, and hardware conditions limit model size. A new improved model based on YOLOv8s with multiple attention mechanisms is proposed. To solve the problem of shared attention weight parameters in receptive field features and to enhance feature extraction, receptive field attention convolution and the CBAM (convolutional block attention module) attention mechanism are introduced into the backbone, adding attention weights in the channel and spatial dimensions. Large separable kernel attention is introduced into the feature pyramid pooling layers to increase information fusion between features at different levels. Feature layers rich in semantic information about small targets are added to improve the neck structure. The inner-IoU loss function is used to improve the MPDIoU (minimum point distance based IoU) function, and the resulting inner-MPDIoU replaces the original loss function to enhance learning on difficult samples. The experimental results show that the improved YOLOv8s model improves mAP, P, and R by 16.1%, 9.3%, and 14.9%, respectively, on the VisDrone dataset, surpasses YOLOv8m in performance, and can be effectively applied to unmanned aerial vehicle visual detection tasks.
    FLM-YOLOv8: Lightweight Mask Wearing Detection Algorithm
    GAO Min, CHEN Gaohua, GU Jiaxin, ZHANG Chunmei
    Computer Engineering and Applications    2024, 60 (17): 203-215.   DOI: 10.3778/j.issn.1002-8331.2402-0226
    Abstract (59)      PDF(pc) (15623KB)(75)
    Existing mask wearing detection models cannot balance detection accuracy and speed well, have a large number of parameters, and suffer from high rates of missed and false detections. To address these problems, a lightweight mask wearing detection algorithm, FLM-YOLOv8, is proposed. Firstly, the lightweight FasterNet is used to replace the backbone feature extraction network of YOLOv8n to improve the network detection speed. Secondly, the C2f module is improved by combining it with the FasterNet Block to reduce the computational complexity of the model. Then, the SPPF-LSKA structure is proposed to enhance the feature expression and perception abilities of the model and improve the network detection accuracy. Finally, the Inner-MPDIoU bounding box regression loss function is designed to improve regression accuracy and accelerate convergence. In addition, a mask wearing dataset annotated across complex and diverse scenes is created and augmented with Mosaic data augmentation to improve the network’s generalization ability. The experimental results show that the mAP@0.5 of the algorithm on the three target classes (masks worn correctly, masks worn incorrectly, and no mask) reaches 91.3%, and the FPS reaches 143.6, achieving more accurate, real-time mask wearing detection.
    Research on Path Planning Optimization Algorithm Based on Loss Function Weight Adaptation
    SUN Yuchun, WANG Sishan, TONG Leyan, CHEN Shaohui, CAO Juyang
    Computer Engineering and Applications    2024, 60 (16): 85-93.   DOI: 10.3778/j.issn.1002-8331.2311-0010
    Abstract (57)      PDF(pc) (4274KB)(69)
    To address the influence of vehicle speed on safety and comfort during autonomous lane changes, this paper proposes an optimal trajectory generation algorithm based on weight adaptation. The optimal trajectory is generated and selected using a loss function weight adaptation method based on real-time vehicle speed and the recursive least squares algorithm. To verify the method, a 3D simulation environment is built in Unity3D with the PhysX dynamics engine, and the parameters of the control module and the planning algorithm are kept consistent within the simulation system, so that the weight-adaptive algorithm can be evaluated against the original algorithm in a variety of scenarios. The experimental results show that, with adaptive updating of the loss function weights, the self-driving car achieves trajectory planning that meets the safety, smoothness, and comfort requirements, and the trajectory generated by the adaptive algorithm is smoother and more comfortable than that of the original algorithm. This research helps to improve the safety, smoothness, and comfort of autonomous lane changing and provides a research basis for deploying autonomous driving in high-speed scenarios.
    Point of Interest Recommendation Methods for Fused Temporal Gated Graph Neural Networks
    TANG Hong, LIU Bin, ZHANG Jing, JIN Zhezheng
    Computer Engineering and Applications    2024, 60 (16): 124-132.   DOI: 10.3778/j.issn.1002-8331.2305-0262
    Abstract (35)      PDF(pc) (3710KB)(69)
    Most existing point of interest (POI) recommendation systems ignore the sequential behavior patterns in user check-in sequences and the influence of users’ personalized preferences on POI recommendation, which leads to low recommendation performance and unreliable results and thus degrades the user experience. To solve these problems, a POI recommendation method that fuses a temporal gated graph neural network is proposed. Firstly, a temporal gated graph neural network (TGGNN) is used to learn the POI embeddings. Secondly, an attention mechanism is used to capture long-term preferences. Then, the latest preferences and real-time preferences are integrated to capture short-term preferences. Finally, the recommendation scores of the candidate POIs are calculated by combining the user’s long-term and short-term preferences, and POIs are recommended to the user according to these scores. The experimental results show that, compared with existing methods, the proposed method significantly improves recall and mean reciprocal rank, so it achieves good recommendation performance and has good application prospects.
    Path Planning Study of Mobile Spraying Robot in Tomato Greenhouse
    GAO Xingwang, REN Lisheng, WANG Fang
    Computer Engineering and Applications    2024, 60 (16): 325-332.   DOI: 10.3778/j.issn.1002-8331.2306-0002
    Abstract (46)      PDF(pc) (5524KB)(69)
    When a mobile spraying robot operates in a tomato greenhouse, path planning suffers from low efficiency, poor smoothness, and potential safety hazards along the path. A path planning algorithm for tomato greenhouse mobile spraying robots that combines an optimized A* algorithm with the dynamic window approach (DWA) is proposed. Taking full account of the specific environment of the tomato greenhouse, the method defines a safety distance for operation and expands the planting area to ensure the safe operation of the mobile spraying robot. Dynamic weight factors are added to the heuristic function and key node extraction is used to improve the efficiency of global path planning, while turning points and cubic B-spline curves are introduced to ensure full coverage and smoothness of the path. Finally, the DWA algorithm is integrated so that the mobile spraying robot avoids sudden obstacles. The experimental results show that the path planned by the optimized algorithm is safer and smoother than that of the traditional algorithm, the coverage area is complete, the planning efficiency is significantly improved, and the fusion algorithm successfully avoids sudden obstacles in the path. The proposed scheme meets the working requirements of mobile spraying robots in complex tomato greenhouses.
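    The idea of adding a dynamic weight factor to the A* heuristic, as described in the abstract above, can be sketched on a small occupancy grid. The weighting rule here (weight = 1 + h(n) / h(start), so the search is greedier far from the goal and closer to plain A* near it) is an assumption chosen for illustration. The abstract does not state the paper's actual formula, and key node extraction, B-spline smoothing, and DWA are not shown.

```python
import heapq


def dynamic_weight_astar(grid, start, goal):
    """A* on a 4-connected grid with a distance-dependent heuristic weight.

    grid[r][c] == 1 marks an obstacle; every move costs 1.
    Returns the path as a list of (row, col) cells, or None if unreachable.
    """
    def h(cell):
        # Manhattan distance to the goal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    h_start = max(h(start), 1)

    def f(g, cell):
        # Assumed dynamic weight: large far from the goal, near 1 close to it.
        weight = 1.0 + h(cell) / h_start
        return g + weight * h(cell)

    open_heap = [(f(0, start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    parent[(nr, nc)] = cell
                    heapq.heappush(open_heap, (f(ng, (nr, nc)), ng, (nr, nc)))
    return None


# Hypothetical layout: a row of plants (1s) with a gap at the right end.
greenhouse = [[0, 0, 0],
              [1, 1, 0],
              [0, 0, 0]]
route = dynamic_weight_astar(greenhouse, (0, 0), (2, 0))
```

    Because the weight can exceed 1, the search generally expands fewer nodes than plain A* but no longer guarantees the shortest path, which is the usual trade-off of weighted heuristics.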
    Review of Research on Lightweight Image Super-Resolution
    ZHU Xinfeng, SONG Jian
    Computer Engineering and Applications    2024, 60 (16): 49-60.   DOI: 10.3778/j.issn.1002-8331.2403-0230
    Abstract (69)      PDF(pc) (4762KB)(68)
    In recent years, image super-resolution (SR) based on deep learning has received widespread attention. The purpose of image SR is to increase the resolution of images to facilitate further processing, such as target detection, image classification, and face recognition. Research on image SR has developed rapidly in recent years, but there are still few reviews of lightweight SR models. By analyzing the current state of research on lightweight SR methods based on deep learning and loss functions, this review proposes a new classification of current lightweight SR models into traditional convolution methods and attention mechanism methods. The development history and latest progress of lightweight image SR methods are presented systematically, and the advantages and disadvantages of each method are pointed out. Finally, by analyzing the existing problems of current lightweight SR technology, future research directions for lightweight image SR methods are given.
    Review of Deep Learning Algorithms for One-Stage Safety Helmet Detection
    GUAN Hanyu, LING Yun, WANG Shulei
    Computer Engineering and Applications    2024, 60 (16): 61-75.   DOI: 10.3778/j.issn.1002-8331.2312-0034
    Abstract (61)      PDF(pc) (5273KB)(66)
    Real-time detection of safety helmet wearing is an essential part of smart construction sites and smart traffic. Safety helmet detection based on deep learning has gradually replaced traditional detection methods, has made significant progress in accuracy, performance, and efficiency, and has been widely used in real scenarios. To facilitate future research on safety helmet wearing detection algorithms, the state of research on object detection algorithms for safety helmets in various application scenarios is analyzed comprehensively. Firstly, the history of object detection algorithms is summarized. Secondly, the advantages and disadvantages of different algorithms and optimizations are analyzed, and lightweight safety helmet detection algorithms are discussed. Finally, based on the shortcomings of current object detection algorithms in real scenes, future directions for deep learning algorithms for safety helmet detection are discussed.
    Review of Data-Driven Approaches to Chinese Named Entity Recognition
    XIAO Lei, CHEN Zhenjia
    Computer Engineering and Applications    2024, 60 (16): 34-48.   DOI: 10.3778/j.issn.1002-8331.2312-0260
    Abstract (72)      PDF(pc) (5433KB)(66)
    Chinese named entity recognition (CNER) is a key step in Chinese information extraction and the basis of downstream tasks such as question answering, machine translation, and knowledge graphs. Its methods fall into two main types: knowledge-driven and data-driven. However, traditional knowledge-driven methods based on rules, dictionaries, and machine learning ignore contextual semantic information and suffer from high computational cost and low recall, which limits the development of CNER technology. Firstly, the definition and development history of CNER are introduced. Secondly, the typical datasets, training tools, sequence annotation methods, and model evaluation metrics for CNER tasks are organized in detail. Thirdly, the data-driven methods are summarized and divided into methods based on deep learning, pre-trained language models, and joint extraction of Chinese entities and relations, and the practical application scenarios of data-driven methods in different fields are analyzed. Finally, future research directions for CNER are outlined to provide a reference for the development of new methods.
    AEM-YOLOv8s: Small Target Detection Algorithm for UAV Aerial Images
    JIANG Wei, WANG Wanhu, YANG Junjie
    Computer Engineering and Applications    2024, 60 (17): 191-202.   DOI: 10.3778/j.issn.1002-8331.2403-0256
    Abstract (52)      PDF(pc) (5211KB)(60)       Save
    The AEM-YOLOv8s algorithm is proposed to address issues of low performance, missed detections, occlusions, and high model parameter count in small object detection in current UAV aerial imagery. Within the C2f module, the advantages of AKConv (alterable kernel convolution) and EMA (efficient multi-scale attention) are combined to design the C2f-BE module, which enhances the algorithm’s ability to process features while reducing the model parameter count. By introducing a small object detection layer and BiFPN structure, through cross-scale connections and weighted feature fusion, more shallow features are retained, reducing algorithm parameters. The design of a multi-scale feature fusion branch merges shallow features containing more small object information with deeper semantic features, reducing missed detections under occlusion and improving small object detection performance. Experimental results on the VisDrone2019 public dataset demonstrate that the AEM-YOLOv8s algorithm achieves an mAP50 of 50.1% and mAP50:95 of 31.1%, representing respective improvements of 10.8 and 7.6 percentage points over YOLOv8s, while also reducing parameters by 32.2% compared to YOLOv8s.
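The weighted feature fusion used with BiFPN can be sketched as follows. The abstract does not give the fusion formula, so this assumes the "fast normalized fusion" of the original BiFPN design: each input feature gets a non-negative learnable weight, and the weights are normalized by their sum plus a small epsilon. Features are flattened to plain lists here, and `fast_normalized_fusion` is an illustrative name.

```python
from typing import List, Sequence


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """Fuse same-shaped feature vectors as sum_i (w_i / (eps + sum_j w_j)) * f_i."""
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature")
    length = len(features[0])
    # Clip weights at zero (ReLU) so every contribution is non-negative.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped) + eps
    fused = [0.0] * length
    for feat, w in zip(features, clipped):
        if len(feat) != length:
            raise ValueError("all features must have the same length")
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


shallow = [1.0, 2.0]   # stand-in for a shallow, high-resolution feature
deep = [3.0, 4.0]      # stand-in for a deeper semantic feature
print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

In a real network the weights are trained parameters and the features are resized to a common resolution before fusion; both steps are omitted here.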
    Reference | Related Articles | Metrics
    ARXML Query Processing Algorithm Based on Dynamic Fusion Index Tree
    DAI Shenlong, TIAN Zhenhu, LI Chaochao, XU Fengjie, FANG Ling
    Computer Engineering and Applications    2024, 60 (16): 76-84.   DOI: 10.3778/j.issn.1002-8331.2311-0025
    Abstract (52)      PDF(pc) (3524KB)(54)       Save
    With the continuous evolution of the automotive industry and the accelerated progress of its smartification, AUTOSAR has emerged as a widely adopted automotive software architecture standard. ARXML (AUTOSAR XML) serves as a crucial resource within this architectural standard, providing descriptions of electronic control units (ECUs) in vehicles. Addressing the efficiency challenges in querying ARXML documents characterized by high data density and complex content, a structured document query processing algorithm is proposed based on a dynamic fusion index tree. The algorithm commences its analysis from the rules governing node relationships, exploring the relationship rules among internal nodes within a single document and nodes across different documents. While preserving the original relationships among nodes, it constructs a node relationship structure enriched with external relationships. Subsequently, the algorithm enhances this structure based on structured document query expressions. Finally, it extends the structure into a dynamic fusion index tree, aiming to reduce the time consumption in document parsing and enhance query performance. Complexity analysis and experimental results demonstrate that the document query efficiency achieved with the dynamic fusion index tree structure surpasses existing query methods, indicating practical utility.
    Reference | Related Articles | Metrics
    Multiscale Feature Fusion Approach for Dual-Modal Object Detection
    ZHANG Rui, LI Yunchen, WANG Jiabao, CHEN Yao, WANG Ziqi, LI Yang
    Computer Engineering and Applications    2024, 60 (17): 233-242.   DOI: 10.3778/j.issn.1002-8331.2305-0412
    Abstract (52)      PDF(pc) (6909KB)(53)       Save
    Object detection based on visible images adapts poorly to complex lighting conditions such as low light, no light and strong light, while object detection based on infrared images is greatly affected by background noise. Infrared objects lack color information and have weak texture features, which poses a greater challenge. To address these problems, a dual-modal object detection approach that can effectively fuse the features of visible and infrared dual-modal images is proposed. A multiscale feature attention module is proposed, which can extract the multiscale features of the input IR and RGB images separately. Meanwhile, channel attention and spatial pixel attention are introduced to focus the multiscale feature information of dual-modal images from both channel and pixel dimensions. Finally, a dual-modal feature fusion module is proposed to adaptively fuse the feature information of dual-modal images. On the large-scale dual-modal image dataset DroneVehicle, compared with the benchmark algorithm YOLOv5s using visible or infrared single-modal image detection, the proposed algorithm improves the detection accuracy by 13.42 and 2.27 percentage points, respectively, and the detection speed reaches 164 frame/s, with ultra-real-time end-to-end detection capability. The proposed algorithm effectively improves the robustness and accuracy of object detection in complex scenes and has good application prospects.
    Reference | Related Articles | Metrics
    Baggage Tracking Technology Based on Improved YOLO v8
    CAO Chao, GU Xingsheng
    Computer Engineering and Applications    2024, 60 (9): 151-158.   DOI: 10.3778/j.issn.1002-8331.2310-0238
    Abstract (181)      PDF(pc) (6479KB)(276)       Save
    In the airport baggage sorting scenario, traditional multi-target tracking algorithms suffer from a high target ID switching rate and a high false alarm rate for target trajectories. This paper presents a baggage tracking technique based on improved YOLO v8 and ByteTrack algorithms. The CBATM module is added, the detection head is replaced with the ADH decoupled head, and the loss function used during training is changed; these changes increase detection accuracy, strengthen the discrimination of target features, and reduce the ID switching rate of targets. GSI interpolation is applied within the Byte data association, which uses both high-score and low-score boxes; this maintains tracking after long occlusions and reduces the ID switching errors caused by occlusion. On the airport baggage sorting dataset, MOTA and IDF1 reach 89.9% and 90.3%, respectively, which is a significant improvement and allows stable tracking of luggage IDs.
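The Byte data association idea of using both high-score and low-score boxes can be sketched as below. This is a simplified illustration, not the paper's implementation: it matches greedily by IoU (ByteTrack itself solves an assignment problem on motion-predicted track boxes), and the thresholds and function names are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], boxes: List[Box],
                 min_iou: float) -> Dict[int, Box]:
    """Greedily pair tracks with boxes, highest IoU first."""
    pairs = sorted(
        ((iou(tbox, box), tid, idx)
         for tid, tbox in tracks.items()
         for idx, box in enumerate(boxes)),
        reverse=True,
    )
    matched: Dict[int, Box] = {}
    used = set()
    for score, tid, idx in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if tid in matched or idx in used:
            continue
        matched[tid] = boxes[idx]
        used.add(idx)
    return matched


def byte_associate(tracks: Dict[int, Box],
                   detections: List[Tuple[Box, float]],
                   high_thresh: float = 0.5,   # illustrative value
                   low_thresh: float = 0.1,    # illustrative value
                   min_iou: float = 0.3) -> Dict[int, Box]:
    """Two-stage association: high-score boxes first, then low-score boxes."""
    high = [box for box, score in detections if score >= high_thresh]
    low = [box for box, score in detections if low_thresh <= score < high_thresh]
    matched = greedy_match(tracks, high, min_iou)
    # Tracks left over (often occluded objects) get a second chance
    # against the low-score detections instead of being dropped.
    remaining = {tid: box for tid, box in tracks.items() if tid not in matched}
    matched.update(greedy_match(remaining, low, min_iou))
    return matched


tracks = {1: (0.0, 0.0, 10.0, 10.0), 2: (20.0, 20.0, 30.0, 30.0)}
detections = [((1.0, 1.0, 11.0, 11.0), 0.9), ((21.0, 21.0, 31.0, 31.0), 0.3)]
print(byte_associate(tracks, detections))
```

The second stage is what keeps a partially occluded bag attached to its existing ID: its detection score drops, but the box still overlaps the track.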
    Reference | Related Articles | Metrics
    Improved Transformer Decoding Algorithm for Dense Video Description
    YANG Dawei, PAN Xiaofang, MAO Lin, ZHANG Rubo
    Computer Engineering and Applications    2024, 60 (17): 89-97.   DOI: 10.3778/j.issn.1002-8331.2306-0024
    Abstract (34)      PDF(pc) (5592KB)(49)       Save
    When applying Transformer to dense video description, historical text features can interfere with subsequent text generation, making it difficult to capture dynamic video information and affecting the coherence and accuracy of the descriptions. To maintain context consistency while mitigating historical text noise, this paper proposes an improved Transformer decoding algorithm for dense video description, called D-Uformer. This algorithm uses a feedforward neural network (FNN) to enhance the representation of historical text features. It constructs pruning branches to remove redundant information and compensatory branches to enhance contextual information through skip connections. Subtraction is used to reduce the impact of inaccurate descriptions caused by over-focusing on historical text features and to improve the model’s attention to input video features. Addition is used to compensate for the loss of contextual information during feature transfer, so that accurate and coherent descriptions of the current video content are generated. Experimental results on the ActivityNet and Charades datasets demonstrate a significant performance improvement of the D-Uformer algorithm. Compared to the temporally descriptive probabilistic captioning (TDPC) network, it achieves a maximum accuracy improvement of 4.816% and a maximum diversity improvement of 4.167%. The generated descriptions not only align better with the video content but also conform more to human language conventions.
    Reference | Related Articles | Metrics
    Pedestrian Intent Semantic VSLAM in Automatic Driving Scenarios
    LUO Zhaoyang, ZHANG Rongfen, LIU Yuhong, LI Jin, FAN Runze
    Computer Engineering and Applications    2024, 60 (17): 107-116.   DOI: 10.3778/j.issn.1002-8331.2306-0159
    Abstract (23)      PDF(pc) (5924KB)(49)       Save
    Visual simultaneous localization and mapping (VSLAM) has found extensive applications in the field of autonomous driving. However, conventional algorithms lack semantic information and are incapable of inferring or predicting pedestrians’ behaviors or intentions within a scene. This paper introduces an effective semantic VSLAM method that employs a semantic segmentation algorithm based on dense prediction transformer (DPT) to acquire segmentation masks for potential dynamic targets, enabling dynamic feature removal. Given that the majority of dynamic objects in autonomous driving scenarios are pedestrians and vehicles, in order to both reintegrate static points from potential dynamic targets and re-detect dynamic objects, a geometric constraint is employed to jointly optimize camera poses while predicting pedestrian intentions. To accurately forecast whether pedestrians are crossing the road, a dual-stream, spatiotemporal adaptive graph convolutional neural network is built using human skeletal information to predict pedestrian jaywalking intentions. The results validated on the KITTI dataset indicate that the proposed approach, in comparison to the ORB-SLAM3 algorithm, has a certain reduction in absolute trajectory estimation errors, demonstrating superior precision compared to algorithms of similar nature. This method holds the potential to furnish autonomous driving systems with richer semantic information, thereby enhancing the accomplishment of autonomous driving tasks.
    Reference | Related Articles | Metrics
    Improved Lightweight Ship Target Detection Algorithm for Optical Remote Sensing Images with YOLOv8
    YANG Zhiyuan, LUO Liang, WU Tianyang, YU Boxiang
    Computer Engineering and Applications    2024, 60 (16): 248-257.   DOI: 10.3778/j.issn.1002-8331.2403-0217
    Abstract (46)      PDF(pc) (4752KB)(48)       Save
    To address the low accuracy and slow detection speed of existing deep learning-based lightweight target detection algorithms when detecting ship targets in optical remote sensing images, a lightweight ship target detection algorithm based on YOLOv8s is proposed. A new lightweight asymmetric detection head is introduced to make the model pay more attention to ship targets in complex backgrounds. A selective attention module is fused into the backbone network to improve detection performance by dynamically adjusting the receptive field of the feature extraction backbone. The idea of Slim-FPN is introduced to improve the neck part, which reduces the number of parameters while maintaining the detection accuracy. A new fast convolutional module, FasterConv, is designed; based on it, the bottleneck structure in C2f is reconstructed and named Faster_C2f, which enhances the feature extraction ability of the network. The experimental results show that the improved algorithm achieves a detection accuracy of 95.2% while maintaining detection speed, which is 1.4% higher than the baseline model. The number of frames detected per second is increased by 8% and the model parameters are reduced by 33%, an improvement over the mainstream algorithms in terms of detection effect.
    Reference | Related Articles | Metrics
    Self-Supervised Graph Representation Learning Method Based on Data and Feature Augmentation
    XU Yunfeng, FAN Hexun
    Computer Engineering and Applications    2024, 60 (17): 148-157.   DOI: 10.3778/j.issn.1002-8331.2306-0254
    Abstract (45)      PDF(pc) (4155KB)(48)       Save
    Graph representation learning plays a crucial role in handling graph data structures, but it faces a significant challenge of heavy reliance on labeled information. To overcome this challenge, a novel self-supervised graph representation learning framework is proposed. By leveraging contrastive learning methods, it integrates the structural and attribute information of the original graph, as well as the high- and low-frequency information in the spectral domain, enhancing the preserved node information. Additionally, residual fusion and unbiased feature augmentation are employed to ensure feature effectiveness while further reducing bias in augmented samples. Moreover, in the contrastive part, the probability that a negative sample is a true negative is estimated, and weights are used to measure the hardness and similarity of negative samples. Experiments on three public datasets show that performance on the downstream node classification task is not only better than the current state-of-the-art unsupervised methods but also surpasses previous supervised methods in most tasks.
    Reference | Related Articles | Metrics
    Imbalanced Classification Method Based on Cross-Class Sample Migration Framework
    YU Haibo, LIU Jing, LI Qiangwei, GAO Xin, TAN Huang, CHEN Tianyang
    Computer Engineering and Applications    2024, 60 (16): 143-158.   DOI: 10.3778/j.issn.1002-8331.2305-0191
    Abstract (41)      PDF(pc) (5707KB)(48)       Save
    For imbalanced classification problems, balancing the number and distribution of samples in overlapping regions is the key to alleviating the subsequent decision bias. Existing imbalanced classification methods often generate new samples only from minority samples to balance the number of samples in different classes, but do not make full use of the rich information in majority samples. Especially when the absolute number of minority samples is too small, using only the original minority sample information cannot effectively balance the distribution of samples in overlapping regions. An imbalanced classification method based on a cross-class sample migration framework is proposed. Firstly, a mapping network built from fully connected layers is embedded in the hidden code sampling process of a variational autoencoder (VAE). By fully learning the commonality and characteristics of different classes of samples, the hidden code of majority samples is mapped and transformed under the influence of hidden coding prior constraints and cross-domain consistency constraints. This makes the hidden codes before and after conversion share the same distribution space, and enables the decoder in the VAE to migrate majority samples to minority samples. At the same time, a generative adversarial mechanism is introduced to discriminate between the original and new samples, as well as between the hidden codes before and after conversion, to further improve the reliability of the migrated samples. Furthermore, the distances between the newly generated samples and the original samples of different categories are weighted, and the samples closer to the overlapping region are selected by screening, so that the number and distribution of different types of samples in the overlapping region are more balanced. Experimental results on 16 public datasets show that the proposed method is significantly superior to 10 typical imbalanced classification methods in F1 measure and G-mean. The improvement is even more significant on 11 public datasets with a high imbalance ratio and a small absolute number of minority samples.
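The two evaluation measures named above can be computed from a binary confusion matrix. The sketch below uses the standard definitions, F1 = 2PR / (P + R) and G-mean = sqrt(sensitivity × specificity), and assumes the minority class is labeled 1; the labels in the example are made up for illustration.

```python
import math
from typing import Sequence, Tuple


def f1_and_gmean(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (F1, G-mean) for binary labels, treating 1 as the minority class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0

    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    gmean = math.sqrt(recall * specificity)
    return f1, gmean


# 2 minority samples among 10: one found, one missed, one false alarm.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(f1_and_gmean(y_true, y_pred))
```

Plain accuracy on this example is 80%, which hides the fact that half of the minority class was missed; F1 and G-mean both penalize that, which is why they are preferred for imbalanced data.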
    Reference | Related Articles | Metrics
    Text-to-Image Generation Method Based on Attention and Dynamic Memory Module
    ZHANG He, LEI Haopeng, WANG Mingwen, ZHANG Shangkun
    Computer Engineering and Applications    2024, 60 (17): 224-232.   DOI: 10.3778/j.issn.1002-8331.2312-0186
    Abstract (34)      PDF(pc) (4328KB)(47)       Save
    To address problems of multi-stage generative models in the text-to-image generation task, such as missing image texture features and poor consistency between text descriptions and generated images, this paper proposes a novel generative adversarial network (ADM-GAN) model. The model is optimized using attention and dynamic memory modules. In the initial stage, the text description is converted into embedding vectors through a text encoder, and a generator combines them with random noise to generate low-resolution images. Then, the paper introduces spatial attention and channel attention modules to fuse the hidden features of the low-resolution image with important word-level semantic features, thereby ensuring the consistency of text description and image features. Finally, the dynamic memory module is used to capture the semantic correspondence between text and images, dynamically adjust the memory content according to the generation process, refine the image texture, and improve the text-to-image synthesis effect. In comparative experiments on the public CUB and COCO datasets, the Fréchet inception distance and inception score of the proposed model are significantly improved over previous methods, showing that the model can, to a certain extent, solve the problem of missing image details and semantic information. It effectively improves the consistency between images and text, and achieves better results.
    Reference | Related Articles | Metrics
    Review of Frequent Temporal Pattern Mining Methods
    TANG Zengjin, XU Zhenshun, SU Mengyao, LIU Na, WANG Zhenbiao, ZHANG Wenhao
    Computer Engineering and Applications    2024, 60 (17): 48-61.   DOI: 10.3778/j.issn.1002-8331.2403-0114
    Abstract (29)      PDF(pc) (5060KB)(44)       Save
    Frequent temporal pattern mining refers to the process of discovering frequently occurring patterns in time series data. Its purpose is to help understand important features in time series data, such as periodicity, trends, and anomalies, which can help predict future development trends and identify abnormal situations. Based on a survey of recent literature, frequent temporal pattern mining methods are divided into three categories according to key technologies and representative algorithms: structural constraint-based methods, parameter constraint-based methods, and window-based methods. Firstly, the background of frequent temporal pattern mining methods and the characteristics of each method are described. Secondly, the development and classification of the three types of mining methods are introduced, and a detailed comparative analysis is conducted on the advantages, disadvantages, and performance of each improved method. Finally, the frequent temporal pattern mining methods are summarized, and future research directions are discussed.
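As a rough illustration of the window-based category, the sketch below slides a fixed-size window over a symbolic sequence and keeps the contiguous patterns that occur in a sufficient fraction of windows. It is a generic toy example, not any specific algorithm from the review; the function name, the support definition and the parameters are assumptions.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def frequent_window_patterns(seq: Sequence[str], window: int, length: int,
                             min_support: float) -> Dict[Tuple[str, ...], float]:
    """Return contiguous patterns of `length` symbols with their window support.

    Support is the fraction of sliding windows that contain the pattern
    at least once.
    """
    if not 0 < length <= window <= len(seq):
        raise ValueError("need 0 < length <= window <= len(seq)")
    n_windows = len(seq) - window + 1
    counts: Counter = Counter()
    for start in range(n_windows):
        win = seq[start:start + window]
        # A set ensures each pattern is counted once per window.
        seen = {tuple(win[i:i + length]) for i in range(window - length + 1)}
        counts.update(seen)
    return {pattern: count / n_windows
            for pattern, count in counts.items()
            if count / n_windows >= min_support}


events = list("abcabd")
print(frequent_window_patterns(events, window=3, length=2, min_support=0.5))
```

For real-valued time series, the values would first be discretized into symbols; that preprocessing step is omitted here.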
    Reference | Related Articles | Metrics
    Omni-Frequency Image Denoising with Multi-Head Attention
    JIANG Jielin, SHI Mingyue, YANG Haidong, CUI Yan
    Computer Engineering and Applications    2024, 60 (16): 236-247.   DOI: 10.3778/j.issn.1002-8331.2311-0430
    Abstract (41)      PDF(pc) (5451KB)(43)       Save
    In recent years, deep convolutional neural networks (CNN) have achieved significant results in the field of image denoising. However, most existing denoising methods directly input noisy images into CNN models for training, relying on cropping a large number of image training blocks. The repeatedly cropped regions not only waste computing resources, but also limit the diversity of feature extraction, resulting in the loss of image texture details. To address these issues, this paper proposes an omni-frequency enhanced multi-head attention network (OEMANet) for removing additive Gaussian white noise and real-world noise. The noisy image is decomposed into low- and high-frequency components, and these two components and the noisy image are then simultaneously input into OEMANet for training. By increasing the network width, richer image features are extracted. An enhanced multi-head attention mechanism focuses on features at the image level and recovers more texture details. To obtain accurate noise mappings, a noise learning module is used to remove redundant features and optimize the remaining features of the image. The effectiveness of OEMANet is verified on multiple datasets such as Set12 and CBSD68. The experimental results show that the proposed method is superior to mainstream denoising methods such as ADNet, AMDNet and MWDCNN in grayscale image denoising, color image denoising, and real image denoising. Moreover, images denoised by OEMANet are visually clearer.
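The decomposition of a noisy image into low- and high-frequency components can be sketched as follows. The abstract does not say which decomposition is used, so this assumes a simple one: a mean (box) filter gives the low-frequency part and the residual gives the high-frequency part. Images are plain nested lists of floats, and the function names are illustrative.

```python
from typing import List, Tuple

Image = List[List[float]]


def box_blur(img: Image, radius: int = 1) -> Image:
    """Mean filter with edge replication; acts as a crude low-pass filter."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels reuse the edge values.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(img: Image, radius: int = 1) -> Tuple[Image, Image]:
    """Split an image into (low, high) so that low + high equals the input."""
    low = box_blur(img, radius)
    high = [[img[y][x] - low[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]
    return low, high


noisy = [
    [10.0, 10.0, 10.0, 10.0],
    [10.0, 50.0, 10.0, 10.0],   # one bright pixel stands in for fine detail
    [10.0, 10.0, 10.0, 10.0],
]
low, high = split_frequencies(noisy)
print(high[1][1])  # the detail pixel has the largest high-frequency response
```

The low-frequency part carries smooth structure, while the high-frequency part carries edges, texture and most of the noise, which is why feeding both to the network alongside the noisy image can help it separate detail from noise.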
    Reference | Related Articles | Metrics
    Computer Engineering and Applications    2024, 60 (17): 0-0.  
    Abstract (7)      PDF(pc) (695KB)(42)       Save
    Related Articles | Metrics
    IDFE: Fingerprint Deep Extraction Method for IoT Device Identification
    TANG Yuezhong, LU Shida, QIAN Lifeng, WEI Xueyin, GU Rongbin, HUANG Jun, LI Jing
    Computer Engineering and Applications    2024, 60 (17): 117-128.   DOI: 10.3778/j.issn.1002-8331.2306-0161
    Abstract (29)      PDF(pc) (5582KB)(42)       Save
    Traditional IoT device fingerprint extraction methods usually use the private data in traffic to generate device fingerprints and rely on manually designed features. This limits the performance of the model and creates security risks. To address these problems, an IoT device deep fingerprint extraction (IDFE) method based on network traffic is proposed. IDFE first divides the network traffic pcap file into multiple sessions and extracts the non-private information to build the session information matrix. Then it designs modeling and fusion methods for the dependency between the different information sequences of the session information matrix and the temporal dependency between the session data packets. Finally, the designed full convolution transformer is used to extract the device behavior features in the fused session feature matrix and generate the device fingerprint.
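The first step, dividing traffic into sessions, can be sketched as below. The abstract does not define a session, so this assumes the common convention of a bidirectional flow identified by the two endpoints and the transport protocol. Packets are assumed to be already parsed from the pcap file into simple records; the `Packet` class and its fields are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical parsed packet record holding header fields only."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    timestamp: float
    length: int


Endpoint = Tuple[str, int]
SessionKey = Tuple[Endpoint, Endpoint, str]


def session_key(p: Packet) -> SessionKey:
    """Direction-independent key: both directions of a flow map to one session."""
    first, second = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
    return (first, second, p.protocol)


def split_sessions(packets: Iterable[Packet]) -> Dict[SessionKey, List[Packet]]:
    """Group packets by session, keeping each session in time order."""
    sessions: Dict[SessionKey, List[Packet]] = defaultdict(list)
    for p in sorted(packets, key=lambda pkt: pkt.timestamp):
        sessions[session_key(p)].append(p)
    return dict(sessions)


packets = [
    Packet("192.168.1.10", 50000, "10.0.0.5", 443, "TCP", 0.00, 74),
    Packet("10.0.0.5", 443, "192.168.1.10", 50000, "TCP", 0.02, 74),
    Packet("192.168.1.10", 50001, "10.0.0.9", 53, "UDP", 0.01, 60),
]
for key, pkts in split_sessions(packets).items():
    print(key, [pkt.length for pkt in pkts])
```

Per-session sequences of header fields such as packet length and inter-arrival time are the kind of non-private information from which a session information matrix could be built.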
    Reference | Related Articles | Metrics
    Comprehensive Review of ROV Underwater Obstacle Detection and Avoidance Technology
    LI Minggui, ZHOU Huanyin, GONG Liwen
    Computer Engineering and Applications    2024, 60 (17): 34-47.   DOI: 10.3778/j.issn.1002-8331.2312-0206
    Abstract (51)      PDF(pc) (5168KB)(42)       Save
    This paper provides a comprehensive review of the technological advancements in underwater obstacle detection and avoidance techniques for remotely operated vehicles (ROV). The research focuses on sonar systems, optical systems, and their integration with machine learning and artificial intelligence algorithms, analyzing how these technologies enhance the autonomy, efficiency, and safety of underwater operations. Despite significant achievements in environmental adaptability and obstacle detection accuracy achieved by sonar and optical systems, challenges remain in real-time identification of dynamic obstacles and adaptation to complex environments. Furthermore, the potential and challenges of machine learning and artificial intelligence technologies in enhancing ROV’s autonomous obstacle avoidance capability are discussed, highlighting the importance of these technologies in future ROV operations. This research provides new theoretical perspectives and practical applications for deep-sea exploration and marine science.
    Reference | Related Articles | Metrics
    Small Sample Steel Plate Defect Detection Algorithm of Lightweight YOLOv8
    DOU Zhi, GAO Haoran, LIU Guoqi, CHANG Baofang
    Computer Engineering and Applications    2024, 60 (9): 90-100.   DOI: 10.3778/j.issn.1002-8331.2311-0070
    Abstract (245)      PDF(pc) (5010KB)(310)       Save
    Steel plates have a large surface area, and surface defects are very common, with many classes but few samples per class. Deep learning is difficult to apply effectively to the detection of such small-sample defects. To solve this problem, a small-sample steel plate defect detection algorithm based on lightweight YOLOv8 is proposed. Firstly, an interactive data augmentation algorithm based on fuzzy search is proposed, which effectively solves the problem that the network model cannot be trained well due to the lack of training samples, making it possible for deep learning to be applied in this field. Then, the LMRNet (lightweight multi-scale residual networks) network is designed to replace the backbone of YOLOv8, making the network model lightweight and improving its portability. Finally, the CBFPN (context bidirectional feature pyramid network) and ECSA (efficient channel spatial attention) modules are proposed to make the network more effective in extracting and fusing scar features, and the Wise-IoU loss function is adopted to improve the detection performance. The comparative experimental results show that, compared with the original YOLOv8 algorithm, the improved network has only 30% of the parameters and 49% of the computation of the original network, and the FPS is increased by 9 frame/s. The accuracy rate, recall rate and mAP are increased by 2.9, 6.5 and 5.5 percentage points, respectively. Experimental results fully verify the advantages of the proposed algorithm.
    Reference | Related Articles | Metrics
    X-Ray Contraband Detection Algorithm Based on Improved YOLOv5
    ZENG Hongxiang, WEN Zhicheng
    Computer Engineering and Applications    2024, 60 (16): 217-227.   DOI: 10.3778/j.issn.1002-8331.2305-0297
    Abstract (48)      PDF(pc) (5865KB)(40)       Save
    To address the detection efficiency for security X-ray images and the missed and false detections of small contraband items, an X-ray contraband detection algorithm based on improved YOLOv5 is proposed. The ProFPN structure is introduced, which increases the participation of original feature information on the basis of FPN+PAN and improves detection accuracy. Compared with the original YOLOv5, a 160×160 small target detection layer is added to give four-scale feature fusion, which improves the learning ability for small targets. The anchor frame sizes are re-clustered using the k-means++ algorithm to better fit the target frame sizes of the self-made dataset and improve detection efficiency. EIOU Loss is adopted as the regression loss function, which minimizes the difference between the width and height of the target frame and the anchor frame, and further improves the positioning accuracy and convergence speed of the detection frame. Experimental results show that, on the public X-ray security dataset, the improved algorithm raises mAP@0.5 by 4.7 percentage points over the original YOLOv5 algorithm. Compared with other mainstream target detection algorithms, it improves mAP@0.5 by up to 28.6 percentage points while having the smallest number of parameters and amount of computation.
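The EIOU regression loss mentioned above can be written out for a single pair of boxes. The sketch follows the commonly cited definition, 1 − IoU plus three penalties (center distance, width difference, height difference), each normalized by the smallest enclosing box; it is scalar Python for readability, not the batched tensor code a detector would use, and `eiou_loss` is an illustrative name.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x2 > x1, y2 > y1


def eiou_loss(pred: Box, target: Box) -> float:
    """EIoU loss for one predicted box and one ground-truth box."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    if min(pw, ph, tw, th) <= 0:
        raise ValueError("boxes must have positive width and height")

    # IoU term.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the enclosing box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)

    # Width and height differences, penalized separately.
    width_term = (pw - tw) ** 2 / (cw * cw)
    height_term = (ph - th) ** 2 / (ch * ch)

    return 1.0 - iou + center_term + width_term + height_term


print(eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Penalizing width and height directly, rather than only the aspect ratio, gives a useful gradient even when the predicted box has the right shape but the wrong size.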
    Reference | Related Articles | Metrics
    Computer Engineering and Applications    2024, 60 (10): 0-0.  
    Abstract (84)      PDF(pc) (756KB)(141)       Save
    Related Articles | Metrics
    Modeling and Evaluation Framework for Constrained Dataflow in Spatial Accelerators
    HE Yuxing, WANG Teng, TENG Wenbin, GONG Lei
    Computer Engineering and Applications    2024, 60 (17): 74-88.   DOI: 10.3778/j.issn.1002-8331.2311-0443
    Abstract (32)      PDF(pc) (6125KB)(39)       Save
    Deploying tensor computation tasks on spatial accelerators has been proven to effectively improve the execution speed and efficiency of tensor computations. To deploy tensor computation on spatial accelerators effectively, various dataflow modeling and evaluation frameworks have been proposed in academia. These frameworks enable quick evaluation of dataflows for efficient design space exploration. However, they lack fine-grained descriptions of the hardware structure, making it difficult to model the constraints that the hardware structure imposes on the dataflow. As a result, they cannot effectively explore the design space of dataflows constrained by real spatial accelerators. To address this issue, this paper first provides a fine-grained model of the hardware architecture, using a multi-level spatial accelerator hardware structure as a template. Each level consists of three components: array structure, storage structure, and interconnect network structure, which respectively describe the constraints of the hardware architecture on the spatial unfolding of the dataflow, storage capacity, and data transmission methods. Then, this paper proposes a tensor computation task and dataflow modeling approach that can solve for the resource requirements of the dataflow. Based on this, the paper further proposes a dataflow evaluation framework consisting of three parts: requirement analysis, constraint analysis, and performance analysis. The requirement analysis determines the demands of computation tasks and dataflows on hardware resources. The constraint analysis examines whether the dataflow violates hardware structure constraints. The performance analysis evaluates performance metrics such as latency, data reuse, and resource utilization of the dataflow. Experimental results demonstrate that, compared to the state-of-the-art evaluation framework, the proposed framework reduces the error in latency evaluation and effectively supports the exploration of the constrained dataflow design space.
    Reference | Related Articles | Metrics