Most Downloaded Articles


    In last 2 years
    Review of Fault Diagnosis Techniques for UAV Flight Control Systems
    AN Xue, LI Shaobo, ZHANG Yizong, ZHANG Ansi
    Computer Engineering and Applications    2023, 59 (24): 1-15.   DOI: 10.3778/j.issn.1002-8331.2305-0137
    Abstract (389)      PDF(pc) (917KB) (1468)
    In recent years, unmanned aerial vehicles (UAVs) have been widely used in complex military and civilian applications thanks to advantages such as low operating costs and high mobility. At the same time, complex and diverse missions place higher demands on the reliability and safety of UAV systems. UAV fault diagnosis technology provides timely and accurate diagnosis results, which supports the maintenance, repair and servicing of UAVs and is of great significance for enhancing their combat effectiveness. This paper first analyses UAV flight control systems and classifies their faults. It then analyses and summarises the research methods and current status of UAV fault diagnosis technology. Finally, it discusses the main challenges facing UAV fault diagnosis technology and points out future development directions, with the aim of providing a reference for researchers in the field and promoting the improvement of UAV fault diagnosis technology in China.
    Research Progress on Vision System and Manipulator of Fruit Picking Robot
    GOU Yuanmin, YAN Jianwei, ZHANG Fugui, SUN Chengyu, XU Yong
    Computer Engineering and Applications    2023, 59 (9): 13-26.   DOI: 10.3778/j.issn.1002-8331.2209-0183
    Abstract (1523)      PDF(pc) (787KB) (1117)
    Fruit-picking robots are of great significance for making fruit-harvesting equipment automated and intelligent. This paper summarizes recent research at home and abroad on the key technologies of fruit-picking robots. First, it discusses the key technologies of the vision system, including traditional image segmentation methods based on fruit features, such as thresholding, edge detection, color-feature-based clustering, and region-based image segmentation. It then analyzes and compares deep learning-based object recognition algorithms and target fruit localization, and summarizes the state of the art of fruit-picking robot manipulators and end-effectors. Finally, it discusses the future development trends and directions of fruit-picking robots, providing a reference for related research.
    Research Progress of YOLO Series Target Detection Algorithms
    WANG Linyi, BAI Jing, LI Wenjing, JIANG Jinzhe
    Computer Engineering and Applications    2023, 59 (14): 15-29.   DOI: 10.3778/j.issn.1002-8331.2301-0081
    Abstract (1392)      PDF(pc) (1009KB) (807)
    YOLO-based algorithms are one of the hot research directions in target detection. In recent years, as YOLO series algorithms and their improved models have continued to appear, YOLO-based algorithms have achieved excellent results in target detection and have been widely used in practice. This article first introduces the typical datasets and evaluation metrics for target detection and reviews the overall YOLO framework and the development of the target detection algorithms from YOLOv1 to YOLOv7. Then, models and their performance are summarized across eight improvement directions, such as data augmentation, lightweight network construction, and IoU loss optimization, at the three stages of input, feature extraction, and prediction. Afterwards, the application fields of YOLO algorithms are introduced. Finally, in light of the practical problems of target detection, the article summarizes YOLO-based algorithms and discusses their future development direction.
    Survey of Transformer Research in Computer Vision
    LI Xiang, ZHANG Tao, ZHANG Zhe, WEI Hongyang, QIAN Yurong
    Computer Engineering and Applications    2023, 59 (1): 1-14.   DOI: 10.3778/j.issn.1002-8331.2204-0207
    Abstract (1168)      PDF(pc) (1285KB) (760)
    Transformer is a deep neural network based on the self-attention mechanism. In recent years, Transformer-based models have become a hot research direction in computer vision, and their structures are constantly being improved and expanded with, for example, local attention mechanisms and pyramid structures. Improved Transformer-based vision models are reviewed and summarized in terms of performance optimization and structural improvement. In addition, the respective structural advantages and disadvantages of the Transformer and the convolutional neural network (CNN) are compared and analyzed, and a new hybrid CNN+Transformer structure is introduced. Finally, the development of Transformer in computer vision is summarized and its prospects are discussed.
    Study on Optimization of Cooperative Distribution Path Between UAVs and Vehicles Under Rural E-Commerce Logistics
    XU Ling, YANG Linchao, ZHU Wenxing, ZHONG Shaojun
    Computer Engineering and Applications    2024, 60 (1): 310-318.   DOI: 10.3778/j.issn.1002-8331.2306-0115
    Abstract (845)      PDF(pc) (666KB) (706)
    Drone delivery has emerged as a significant solution to the challenges of last-mile logistics. The collaborative delivery model between drones and vehicles overcomes the limited delivery capacity of drones and enhances safety, making it a vital approach for drone involvement in the delivery process. To tackle the difficulties and high costs associated with “last-mile” delivery in rural e-commerce logistics, this study constructs a mixed-integer programming model. The objective is to minimize delivery costs while considering constraints such as the collaborative drone-vehicle mode and multi-drone, multi-parcel delivery. A two-stage algorithm is proposed to optimize the paths for drone-vehicle collaborative delivery. In the first stage, a constrained adaptive K-means algorithm is used to determine the range of vehicle docking points. In the second stage, an improved genetic algorithm that incorporates hill climbing and splitting operators is employed to identify the optimal delivery paths for drones and vehicles. A case study experiment is then conducted to validate the feasibility and effectiveness of the model and algorithm. The research findings are expected to offer novel insights and valuable references for cost reduction and efficiency improvement in last-mile delivery for rural e-commerce logistics.
    Survey of Transformer-Based Object Detection Algorithms
    LI Jian, DU Jianqiang, ZHU Yanchen, GUO Yongkun
    Computer Engineering and Applications    2023, 59 (10): 48-64.   DOI: 10.3778/j.issn.1002-8331.2211-0133
    Abstract (1164)      PDF(pc) (875KB) (701)
    Transformer is a deep learning framework with strong modeling and parallel computing capabilities. Transformer-based object detection has become a research hotspot. To further explore new ideas and directions, this paper summarizes existing Transformer-based object detection algorithms as well as a variety of object detection datasets and their application scenarios. It describes the related algorithms for Transformer-based object detection from four aspects: feature extraction, object estimation, label matching policy, and algorithm application. It also compares Transformer-based algorithms with object detection algorithms based on convolutional neural networks, analyzes the advantages and disadvantages of Transformer in object detection tasks, and proposes a general framework for Transformer-based object detection models. Finally, the development trends of Transformer in the field of object detection are discussed.
    Overview of Multi-Agent Path Finding
    LIU Zhifei, CAO Lei, LAI Jun, CHEN Xiliang, CHEN Ying
    Computer Engineering and Applications    2022, 58 (20): 43-64.   DOI: 10.3778/j.issn.1002-8331.2203-0467
    Abstract (1462)      PDF(pc) (1013KB) (679)
    The multi-agent path finding (MAPF) problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents must be able to follow these paths concurrently without colliding with each other. MAPF is widely used in logistics, military, security and other fields. Based on a systematic sorting and classification of the main MAPF research results at home and abroad by planning method, MAPF algorithms can be divided into centralized planning algorithms and distributed execution algorithms. Centralized planning algorithms are both the most classical and the most commonly used MAPF algorithms. They fall mainly into four types, based on A* search, conflict search, cost growth tree and protocol. Distributed execution algorithms, the other part of MAPF, are based on reinforcement learning. According to the improvement technique used, they can be divided into three types: expert demonstration, improved communication, and task decomposition. By comparing the characteristics and applicability of MAPF algorithms and analyzing the advantages and disadvantages of existing algorithms, the challenges facing existing algorithms are pointed out and future work is discussed on the basis of this classification.
    Review on Research and Application of Deep Learning-Based Target Detection Algorithms
    ZHANG Yangting, HUANG Deqi, WANG Dongwei, HE Jiajia
    Computer Engineering and Applications    2023, 59 (18): 1-13.   DOI: 10.3778/j.issn.1002-8331.2305-0310
    Abstract (1093)      PDF(pc) (662KB) (678)
    With the continuous development of deep learning, deep convolutional neural networks are increasingly used for target detection and are now applied in many fields such as agriculture, transportation, and medicine. Compared with traditional methods based on hand-crafted features, deep learning-based target detection methods can learn both low-level and high-level image features, with better detection accuracy and generalization ability. To outline and summarize the latest advances and technologies in the field, this paper reviews the status of deep learning-based target detection algorithms and applications by analyzing the deep learning-based target detection technologies of recent years. Firstly, the development, advantages and disadvantages of the two kinds of target detection network architectures, two-stage and single-stage, are summarized. Secondly, the backbone networks, datasets and evaluation metrics are described, the detection accuracy of classical algorithms is compared, and the improvement strategies of classical target detection algorithms are summarized. Finally, current applications of target detection are discussed, and future research priorities in the field are proposed.
    Survey on Image Semantic Segmentation in Dilemma of Few-Shot
    WEI Ting, LI Xinlei, LIU Hui
    Computer Engineering and Applications    2023, 59 (2): 1-11.   DOI: 10.3778/j.issn.1002-8331.2205-0496
    Abstract (893)      PDF(pc) (4301KB) (676)
    In recent years, image semantic segmentation has developed rapidly due to the emergence of large-scale datasets. However, in practical applications, it is not easy to obtain large-scale, high-quality images, and image annotation also consumes a great deal of manpower and time. To reduce the dependence on the number of samples, few-shot semantic segmentation has gradually become a research hotspot. Current few-shot semantic segmentation methods mainly use the idea of meta-learning and, according to their model structures, can be divided into three categories: those based on the siamese neural network, those based on the prototype network, and those based on the attention mechanism. Based on current research, this paper introduces the development, advantages and disadvantages of various methods for few-shot semantic segmentation, as well as common datasets and experimental designs. On this basis, the application scenarios and future development directions are summarized.
    Review of Research on Road Traffic Flow Data Prediction Methods
    MENG Chuang, WANG Hui, LIN Hao, LI Kecen, WANG Xinpeng
    Computer Engineering and Applications    2023, 59 (14): 51-61.   DOI: 10.3778/j.issn.1002-8331.2209-0458
    Abstract (1295)      PDF(pc) (605KB) (627)
    As an important branch of intelligent transportation systems, road traffic flow prediction plays an important role in congestion prediction and path planning. In the era of big data, the spatio-temporal polymorphism and complex correlation of road traffic flow data are forcing the transformation and upgrading of road traffic flow prediction methods. To mine the spatio-temporal characteristics of traffic flow and improve prediction accuracy, scholars have proposed various methods, including model fusion, model algorithm improvement, and data definition conversion. To summarize these traffic flow prediction methods in a reasonable way, they are divided into three categories according to the type of method used: statistics-based methods, machine learning-based methods, and deep learning-based methods. This paper summarizes and analyzes the new models and algorithms of recent years, aiming to provide research ideas for relevant researchers. Finally, the methods of traffic flow prediction are summarized, and future directions of exploration in the field are given.
    Survey of Research on Deep Multimodal Representation Learning
    PAN Mengzhu, LI Qianmu, QIU Tian
    Computer Engineering and Applications    2023, 59 (2): 48-64.   DOI: 10.3778/j.issn.1002-8331.2206-0145
    Abstract (869)      PDF(pc) (6521KB) (613)
    Although deep learning has been widely used in many fields because of its powerful nonlinear representation capabilities, the structural and semantic gap between multi-source heterogeneous modal data seriously hinders the application of subsequent deep learning models. Many scholars have proposed a large number of representation learning methods to explore the correlation and complementarity between different modalities, and improve the performance of deep learning prediction and generalization. However, the research on multimodal representation learning is still in its infancy, and there are still many scientific problems to be solved. So far, multimodal representation learning still lacks a unified cognition, and the architecture and evaluation metrics of multimodal representation learning research are not fully clear. According to the feature structure, semantic information and representation ability of different modalities, this paper studies and analyzes the progress of deep multimodal representation learning from the perspectives of representation fusion and representation alignment. And the existing research work is systematically summarized and scientifically classified. At the same time, this paper analyzes the basic structure, application scenarios and key issues of representative frameworks and models, analyzes the theoretical basis and latest development of deep multimodal representation learning, and points out the current challenges and future development of multimodal representation learning research, to further promote the development and application of deep multimodal representation learning.
    Review of Explainable Artificial Intelligence
    ZHAO Yanyu, ZHAO Xiaoyong, WANG Lei, WANG Ningning
    Computer Engineering and Applications    2023, 59 (14): 1-14.   DOI: 10.3778/j.issn.1002-8331.2208-0322
    Abstract (925)      PDF(pc) (683KB) (579)
    With the development of machine learning and deep learning, artificial intelligence technology has been gradually applied in various fields. However, one of the biggest drawbacks of adopting AI is its inability to explain the basis for its predictions. The black-box nature of the models means that humans cannot yet truly trust them in mission-critical application scenarios such as healthcare, finance, and autonomous driving, which limits the practical deployment of AI in these areas. Driving the development of explainable artificial intelligence (XAI) has therefore become an important issue for bringing mission-critical applications into practice. At present, there is still a lack of research reviews on XAI in related fields at home and abroad, as well as a lack of studies focusing on causal explanation methods and the evaluation of explainable methods. Therefore, starting from the characteristics of explanatory methods, this study divides the main explainable methods into three categories from the perspective of explanation types: model-independent methods, model-dependent methods, and causal explanation methods. It summarizes and analyzes each category, then summarizes the evaluation of explanation methods, lists the applications of explainable AI, and finally discusses the current problems of explainability and provides an outlook.
    Review of SLAM Based on Lidar
    LIU Mingzhe, XU Guanghui, TANG Tang, QIAN Xiaojian, GENG Ming
    Computer Engineering and Applications    2024, 60 (1): 1-14.   DOI: 10.3778/j.issn.1002-8331.2308-0455
    Abstract (868)      PDF(pc) (854KB) (575)
    Simultaneous localization and mapping (SLAM) is a crucial technology for autonomous mobile robots and autonomous driving systems, with a laser scanner (also known as lidar) playing a vital role as a supporting sensor for SLAM algorithms. This article provides a comprehensive review of lidar-based SLAM algorithms. Firstly, it introduces the overall framework of lidar-based SLAM, providing detailed explanations of the functions of the front-end odometry, back-end optimization, loop closure detection, and map building modules, along with a summary of the algorithms used. Secondly, it presents descriptions and summaries of representative open-source algorithms in a sequential order of 2D to 3D and single-sensor to multi-sensor fusion. Additionally, it discusses commonly used open-source datasets, precision evaluation metrics, and evaluation tools. Lastly, it offers an outlook on the development trends of lidar-based SLAM technology from four dimensions: deep learning, multi-sensor fusion, multi-robot collaboration, and robustness research.
    Overview of Smoke and Fire Detection Algorithms Based on Deep Learning
    ZHU Yuhua, SI Yiyi, LI Zhihui
    Computer Engineering and Applications    2022, 58 (23): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0154
    Abstract (1044)      PDF(pc) (782KB) (544)
    Among various disasters, fire is one of the main disasters that most frequently and universally threaten public safety and social development. With the rapid development of economic construction and the increasing size of cities, the number of major fire hazards has increased dramatically. However, the widely used smoke sensor method of fire detection is vulnerable to factors such as distance, resulting in untimely detection. The introduction of video surveillance systems has provided new ideas for solving this problem. Traditional video-based image processing algorithms were proposed earlier, and the recent rapid development of machine vision and image processing technologies has produced a series of methods that use deep learning to automatically detect fires in video and images, which have very important practical applications in the field of fire safety. To comprehensively analyze the improvements and applications of deep learning methods for fire detection, this paper first briefly introduces the deep learning-based fire detection process, then focuses on a detailed comparative analysis of deep learning methods for fire detection at three granularities (classification, detection, and segmentation), and elaborates on the improvements each class of algorithms makes to address existing problems. Finally, the problems of fire detection at the present stage are summarized and future research directions are proposed.
    Review of Deep Reinforcement Learning Model Research on Vehicle Routing Problems
    YANG Xiaoxiao, KE Lin, CHEN Zhibin
    Computer Engineering and Applications    2023, 59 (5): 1-13.   DOI: 10.3778/j.issn.1002-8331.2210-0153
    Abstract (998)      PDF(pc) (1036KB) (542)
    The vehicle routing problem (VRP) is a classic NP-hard problem that is widely applied in transportation, logistics and other fields. As problem scale and dynamic factors increase, traditional methods of solving the VRP are challenged in computational speed and intelligence. In recent years, with the rapid development of artificial intelligence technology, and in particular the successful application of reinforcement learning in AlphaGo, a new idea has emerged for solving routing problems. In view of this, this paper summarizes the recent literature that uses deep reinforcement learning (DRL) to solve the VRP and its variants. Firstly, it reviews the relevant principles of using DRL to solve the VRP and sorts out the key steps of DRL-based VRP solving. Then it systematically classifies and summarizes four types of solving methods: pointer networks, graph neural networks, Transformers and hybrid models. It also compares and analyzes the performance of current DRL-based models in solving the VRP and its variants. Finally, it sums up the challenges of DRL-based VRP solving and future research directions.
    Overview of Image Edge Detection
    XIAO Yang, ZHOU Jun
    Computer Engineering and Applications    2023, 59 (5): 40-54.   DOI: 10.3778/j.issn.1002-8331.2209-0122
    Abstract (1021)      PDF(pc) (921KB) (540)
    The task of edge detection is to identify pixels with significant brightness changes as target edges. It is a low-level problem in computer vision with important applications in object recognition and detection, object proposal generation, and image segmentation. Edge detection methods now fall into several types, such as traditional gradient-based detection methods, deep learning-based edge detection algorithms, and detection methods combined with emerging technologies. A finer classification of these methods gives researchers a clearer understanding of the trends in edge detection. Firstly, the theoretical basis and implementation methods of traditional edge detection are introduced. Then the main edge detection methods of recent years are summarized and classified according to the methods used, and their core techniques are introduced, such as branching structure, feature fusion and loss function. The evaluation indicators used to assess algorithm performance are single-image optimal threshold (ODS) and frames per second (FPS), which are compared on the benchmark dataset (BSDS500). Finally, the current state of edge detection research is examined and summarized, and possible future research directions are discussed.
    Research on Urban Logistics Distribution Mode of Bus-Assisted Drones
    PENG Yong, REN Zhi
    Computer Engineering and Applications    2024, 60 (7): 335-343.   DOI: 10.3778/j.issn.1002-8331.2212-0252
    Abstract (649)      PDF(pc) (755KB) (537)
    The rapid development of e-commerce is forcing the continuous transformation and upgrading of the logistics industry. Given that local governments encourage the development of public transport and advocate green, low-carbon logistics distribution, a bus-assisted drone distribution mode is studied. After the problem is described, a mathematical model that minimizes distribution cost is constructed, and a smart general variable neighborhood search metaheuristic is designed to solve it. To improve the efficiency of the algorithm, K-means clustering and a greedy algorithm are introduced to generate the initial solution. Firstly, for instances of different scales, a variety of local search strategies and algorithms are compared to verify the effectiveness of the algorithm. Secondly, using standard CVRP instances, the single-truck distribution mode and the truck-drone collaborative distribution mode are compared with the bus-assisted drone distribution mode to demonstrate its cost and time advantages. Finally, Beijing Bus Rapid Transit Line 2 and its surrounding customer points are selected, and a sensitivity analysis is performed by changing the bus stop spacing and departure interval. The results show that increasing the stop spacing has a greater impact than changing the departure interval.
    Small Object Detection Algorithm Based on Improved YOLOv5 in UAV Image
    XIE Chunhui, WU Jinming, XU Huaiyu
    Computer Engineering and Applications    2023, 59 (9): 198-206.   DOI: 10.3778/j.issn.1002-8331.2212-0336
    Abstract (798)      PDF(pc) (808KB) (510)
    UAV aerial images have characteristics such as large scale variation and complex backgrounds, which make it difficult for existing detectors to detect small objects in them. To address false and missed detections, a small object detection model, Drone-YOLO, is proposed. A new detection branch is added to improve detection capability at multiple scales, and the model contains a novel feature pyramid network with multi-level information aggregation, which fuses cross-layer information. A feature fusion module based on a multi-scale channel attention mechanism is then designed to improve the focus on small objects. The classification task of the prediction head is decoupled from the regression task, and the loss function is optimized using Alpha-IoU to improve detection accuracy. Experimental results on the VisDrone dataset show that Drone-YOLO improves AP50 by 4.91 percentage points compared with YOLOv5, with an inference time of only 16.78 ms. Compared with other mainstream models, it detects small targets better and can effectively complete the task of small target detection in UAV aerial images.
    Survey of Camera Pose Estimation Methods Based on Deep Learning
    WANG Jing, JIN Yuchu, GUO Ping, HU Shaoyi
    Computer Engineering and Applications    2023, 59 (7): 1-14.   DOI: 10.3778/j.issn.1002-8331.2209-0280
    Abstract (1054)      PDF(pc) (702KB) (505)
    Camera pose estimation is a technology for accurately estimating the 6-DOF position and orientation of a camera in the world coordinate system in a known environment. It is a key technology in robotics and autonomous driving. With the rapid development of deep learning, using deep learning to optimize camera pose estimation algorithms has become one of the current research hotspots. To capture the current research status and trends of camera pose estimation algorithms, the mainstream deep learning-based algorithms are summarized. Firstly, traditional camera pose estimation methods based on feature points are briefly introduced. Then, deep learning-based camera pose estimation methods are introduced in detail. According to their core algorithms, end-to-end camera pose estimation, scene coordinate regression, retrieval-based camera pose estimation, hierarchical structure, multi-information fusion and cross-scene camera pose estimation are elaborated and analyzed. Finally, this paper summarizes the current research status, points out the challenges in the field on the basis of in-depth performance analysis, and discusses the development trends of camera pose estimation.
    Research Progress in Application of Graph Anomaly Detection in Financial Anti-Fraud
    LIU Hualing, LIU Yaxin, XU Junyi, CHEN Shanghui, QIAO Liang
    Computer Engineering and Applications    2022, 58 (22): 41-53.   DOI: 10.3778/j.issn.1002-8331.2203-0233
    Abstract (720)      PDF(pc) (1848KB) (503)
    With the rapid development of digital finance, fraud presents new characteristics such as intelligence, industrialization and strong concealment, and the limitations of traditional expert rules and machine learning methods are increasingly apparent. Graph anomaly detection technology has a strong ability to handle associated information, which provides new ideas for financial anti-fraud. Firstly, the development and advantages of graph anomaly detection are briefly introduced. Secondly, from the perspectives of individual anti-fraud and group anti-fraud, graph anomaly detection technology is divided into individual fraud detection based on features, proximity, graph representation learning or community division, and gang fraud detection based on dense subgraphs, dense subtensors or deep network structures. The basic idea, advantages, disadvantages, research progress and typical applications of each anomaly detection technology are analyzed and compared. Finally, the common test datasets and evaluation criteria are summarized, and the development prospects and research directions of graph anomaly detection technology in financial anti-fraud are given.
    Review of Visual Odometry Methods Based on Deep Learning
    ZHI Henghui, YIN Chenyang, LI Huibin
    Computer Engineering and Applications    2022, 58 (20): 1-15.   DOI: 10.3778/j.issn.1002-8331.2203-0480
    Abstract (906)      PDF(pc) (904KB) (493)
    Visual odometry (VO) is a common method for positioning mobile devices equipped with vision sensors, and has been widely used in autonomous driving, mobile robots, AR/VR and other fields. Compared with traditional model-based methods, deep learning-based methods can learn efficient and robust feature representations from data without explicit computation, thereby improving their ability to handle challenging scenes such as illumination changes and sparse texture. This paper first briefly reviews model-based visual odometry methods, and then focuses on six aspects of deep learning-based visual odometry: supervised learning methods, unsupervised learning methods, model-learning fusion methods, common datasets, evaluation metrics, and a comparison of model-based and deep learning methods. Finally, existing problems and future development trends of deep learning-based visual odometry are discussed.
    Multi-Modal Meteorological Forecasting Based on Transformer
    XIANG Deping, ZHANG Pu, XIANG Shiming, PAN Chunhong
    Computer Engineering and Applications    2023, 59 (10): 94-103.   DOI: 10.3778/j.issn.1002-8331.2208-0486
    Abstract (726)      PDF(pc) (977KB) (485)
    Thanks to the rapid development of meteorological observation technology, the meteorological industry has accumulated massive amounts of data, which provides an opportunity to build new data-driven forecasting methods. Because of the long-term dependencies and large-scale spatial correlations hidden in meteorological data, and the complex coupling between different modalities, meteorological forecasting with deep learning remains a challenging research topic. This paper presents a deep learning model for meteorological forecasting based on multi-modal fusion, using sequential multi-modal data at the same atmospheric pressure levels composed of four classical meteorological elements: temperature, relative humidity, U-component of wind and V-component of wind. Firstly, a convolutional network learns features from every modality, and a gating mechanism then performs weighted fusion of those multi-modal features. Secondly, parallel spatial-temporal axial attention replaces the traditional attention mechanism in order to effectively learn long-term dependencies and large-scale spatial associations. Architecturally, the Transformer encoder-decoder structure is employed as the overall framework. Extensive comparative experiments on the regional ERA5 reanalysis dataset demonstrate that the proposed method is effective and superior in predicting temperature, relative humidity and wind.
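The gated weighted fusion mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the gate, so the softmax gate, the `gate_weights` parameters and the toy feature vectors below are all assumptions made for illustration, not the authors' model.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gated_fusion(features, gate_weights):
    """Fuse per-modality feature vectors with one scalar gate per modality.

    features:     dict modality -> feature vector (all the same length)
    gate_weights: dict modality -> weight vector (hypothetical learned parameters)
    Returns (gates, fused): the gate per modality and the weighted sum of features.
    """
    names = sorted(features)
    # One gate score per modality: dot product of its features and gate weights.
    scores = [sum(w * f for w, f in zip(gate_weights[n], features[n])) for n in names]
    gates = softmax(scores)  # assumption: gates are normalised to sum to 1
    dim = len(features[names[0]])
    fused = [sum(g * features[n][i] for g, n in zip(gates, names)) for i in range(dim)]
    return dict(zip(names, gates)), fused

# Toy features for the four elements named in the abstract (values are made up).
features = {
    "temperature": [1.0, 0.0],
    "humidity": [0.0, 1.0],
    "u_wind": [1.0, 1.0],
    "v_wind": [0.0, 0.0],
}
zero_gate = {name: [0.0, 0.0] for name in features}
gates, fused = gated_fusion(features, zero_gate)
```

With all gate weights at zero every modality receives the same gate, so the fused vector is simply the mean of the four feature vectors; trained gate weights would shift that balance per input.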
    Reference | Related Articles | Metrics
    Construction and Application of Discipline Knowledge Graph in Personalized Learning
    ZHAO Yubo, ZHANG Liping, YAN Sheng, HOU Min, GAO Mao
    Computer Engineering and Applications    2023, 59 (10): 1-21.   DOI: 10.3778/j.issn.1002-8331.2209-0345
    Abstract (766)      PDF(pc) (929KB) (480)
    The discipline knowledge graph is an important tool for supporting teaching activities based on big data, artificial intelligence and other technologies. As a semantic network of discipline knowledge, it contributes to the development of personalized learning systems and the promotion of new infrastructure for digital education resources. Firstly, this paper outlines the concept and classification of knowledge graphs. Secondly, it summarizes the concept, characteristics, advantages and connotation of the discipline knowledge graph and its support for personalized learning. Next, it focuses on the construction process of the discipline knowledge graph: discipline ontology construction, discipline knowledge extraction, discipline knowledge fusion and discipline knowledge processing. It also introduces the applications of the discipline knowledge graph in personalized learning scenarios and the associated challenges. Finally, this paper looks ahead to future trends in discipline knowledge graphs and personalized learning. It provides reference and inspiration for the organization of educational resources and the innovative development of personalized learning.
    Reference | Related Articles | Metrics
    Cross-Social Network User Matching Based on User Check-in
    DAI Jun, MA Qiang
    Computer Engineering and Applications    2023, 59 (2): 76-84.   DOI: 10.3778/j.issn.1002-8331.2203-0581
    Abstract (113)      PDF(pc) (28513KB) (475)
    Cross-social network user matching technology can integrate multi-platform user data to enable more diverse applications. Existing research on check-in-based social network user matching ignores the imbalance of multi-source check-in data, which reduces matching accuracy on real datasets. To address this problem, this paper proposes a cross-social network user matching method based on user check-ins. Firstly, the user check-in data is coarsened and filtered with a grid clustering algorithm, and the check-in data with strong potential correlation is selected. Then spatiotemporal features are extracted from the check-in data, and the similarity of different attributes is calculated. Finally, a comprehensive user matching score is computed by optimizing the weight distribution over the multi-attribute similarities. Experimental results on multiple datasets demonstrate the effectiveness of the proposed method when check-in data is unbalanced.
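The three steps in this abstract (grid-based coarsening, per-attribute similarity, weighted score) can be sketched roughly as follows. This is only an illustration under stated assumptions: the grid cell size, the `min_visits` filter, cosine similarity, the hour-of-day temporal feature and the fixed weights `w_space` and `w_time` are all hypothetical choices, whereas the paper optimizes the weights rather than fixing them.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to a coarse grid cell (cell size in degrees is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def coarsen(checkins, cell_deg=0.01, min_visits=2):
    """Count visits per grid cell and keep only cells visited at least min_visits times."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon, _hour in checkins)
    return {cell: n for cell, n in counts.items() if n >= min_visits}

def hour_histogram(checkins):
    """Temporal feature: number of check-ins per hour of day."""
    return Counter(hour for _lat, _lon, hour in checkins)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_score(u, v, w_space=0.6, w_time=0.4):
    """Weighted combination of spatial and temporal similarity (fixed, hypothetical weights)."""
    spatial = cosine(coarsen(u), coarsen(v))
    temporal = cosine(hour_histogram(u), hour_histogram(v))
    return w_space * spatial + w_time * temporal

# Made-up check-ins as (latitude, longitude, hour of day).
alice_a = [(39.9042, 116.4074, 9), (39.9043, 116.4075, 9),
           (31.2304, 121.4737, 20), (31.2305, 121.4738, 20)]
alice_b = [(39.9041, 116.4076, 9), (39.9044, 116.4071, 10),
           (31.2301, 121.4739, 20), (31.2309, 121.4731, 21)]
bob = [(22.5431, 114.0579, 14), (22.5432, 114.0578, 14)]

same_user = match_score(alice_a, alice_b)
different_user = match_score(alice_a, bob)
```

In this toy data the two `alice` accounts share grid cells and similar hours, so `same_user` exceeds `different_user`; a real system would also have to handle the data imbalance the abstract highlights.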
    Reference | Related Articles | Metrics
    Survey of Sentiment Analysis Algorithms Based on Multimodal Fusion
    GUO Xu, Mairidan Wushouer, Gulanbaier Tuerhong
    Computer Engineering and Applications    2024, 60 (2): 1-18.   DOI: 10.3778/j.issn.1002-8331.2305-0439
    Abstract (626)      PDF(pc) (954KB) (455)
    Sentiment analysis is an emerging technology that aims to explore people’s attitudes toward entities and can be applied to various domains and scenarios, such as product evaluation analysis, public opinion analysis, mental health analysis and risk assessment. Traditional sentiment analysis models focus on text content, yet some special forms of expression, such as sarcasm and hyperbole, are difficult to detect through text. As technology continues to advance, people can now express their opinions and feelings through multiple channels such as audio, images and videos, so sentiment analysis is shifting to multimodality, which brings new opportunities for sentiment analysis. Multimodal sentiment analysis contains rich visual and auditory information in addition to textual information, and the implied sentiment polarity (positive, neutral, negative) can be inferred more accurately using fusion analysis. The main challenge of multimodal sentiment analysis is the integration of cross-modal sentiment information; therefore, this paper focuses on the framework and characteristics of different fusion methods and describes the popular fusion algorithms in recent years, and discusses the current multimodal sentiment analysis in small sample scenarios, in addition to the current development status, common datasets, feature extraction algorithms, application areas and challenges. It is expected that this review will help researchers understand the current state of research in the field of multimodal sentiment analysis and be inspired to develop more effective models.
    Reference | Related Articles | Metrics
    Image Inpainting Algorithm Based on Deep Neural Networks
    LYU Jianfeng, SHAO Lizhen, LEI Xuemei
    Computer Engineering and Applications    2023, 59 (20): 1-12.   DOI: 10.3778/j.issn.1002-8331.2303-0111
    Abstract (443)      PDF(pc) (720KB) (450)
    With the rapid development of deep learning, computer vision technology is being applied more and more widely. At the same time, image inpainting that uses deep neural networks to exploit the known information in a damaged image has become a hot topic. This paper reviews and analyzes the image inpainting methods based on deep neural networks proposed in recent years. Firstly, the image inpainting methods are classified and summarized from the perspective of model optimization. Then the common datasets and performance evaluation indicators are introduced, and various deep neural network-based image inpainting algorithms are evaluated and analyzed on the relevant datasets. Finally, the challenges faced by existing image inpainting methods are analyzed, and future research directions are outlined.
    Reference | Related Articles | Metrics
    Improved YOLOv8s Model for Small Object Detection from Perspective of Drones
    PAN Wei, WEI Chao, QIAN Chunyu, YANG Zhe
    Computer Engineering and Applications    2024, 60 (9): 142-150.   DOI: 10.3778/j.issn.1002-8331.2312-0043
    Abstract (314)      PDF(pc) (5858KB) (440)
    Object detection from the perspective of drones yields imprecise results because image targets are small and densely distributed, classes are unevenly distributed, and hardware conditions limit model size. An improved model based on YOLOv8s with multiple attention mechanisms is proposed. To solve the problem of shared attention weight parameters in receptive field features and to enhance feature extraction ability, receptive field attention convolution and the CBAM (convolutional block attention module) attention mechanism are introduced into the backbone, adding attention weights in the channel and spatial dimensions. Introducing large separable kernel attention into the feature pyramid pooling layers increases information fusion between features at different levels. Feature layers with rich semantic information about small targets are added to improve the neck structure. The inner-IoU loss function is used to improve the MPDIoU (minimum point distance based IoU) function, and the resulting inner-MPDIoU replaces the original loss function to enhance the learning ability for difficult samples. The experimental results show that the improved YOLOv8s model improves mAP, P, and R by 16.1%, 9.3%, and 14.9% respectively on the VisDrone dataset, surpasses YOLOv8m in performance, and can be effectively applied to unmanned aerial vehicle visual detection tasks.
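As a rough illustration of the loss described in this abstract, the sketch below combines MPDIoU (IoU penalised by the normalised squared distances between matching corners) with an inner-IoU term computed on auxiliary boxes scaled about their centres. The abstract gives no equations, so the combination `1 - MPDIoU + IoU - IoU_inner` and the `ratio=0.75` default are assumptions based on how these losses are commonly formulated, not the authors' code.

```python
def iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def scale_box(box, ratio):
    """Auxiliary box with the same centre and sides scaled by ratio (inner-IoU idea)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) * ratio / 2
    half_h = (box[3] - box[1]) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def mpdiou(pred, gt, img_w, img_h):
    """IoU minus squared top-left and bottom-right corner distances, normalised by image size."""
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, gt) - d1 / norm - d2 / norm

def inner_mpdiou_loss(pred, gt, img_w, img_h, ratio=0.75):
    """Assumed combination: MPDIoU loss plus the gap between IoU and inner IoU."""
    inner = iou(scale_box(pred, ratio), scale_box(gt, ratio))
    return 1.0 - mpdiou(pred, gt, img_w, img_h) + iou(pred, gt) - inner

gt_box = (10.0, 10.0, 50.0, 50.0)
loss_exact = inner_mpdiou_loss(gt_box, gt_box, 100, 100)
loss_shifted = inner_mpdiou_loss((20.0, 20.0, 60.0, 60.0), gt_box, 100, 100)
loss_far = inner_mpdiou_loss((60.0, 60.0, 100.0, 100.0), gt_box, 100, 100)
```

A ratio below 1 shrinks the auxiliary boxes, which makes the inner overlap drop faster than the plain IoU as the prediction drifts; the loss is zero for a perfect prediction and grows as the box moves away.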
    Reference | Related Articles | Metrics
    Review of Research on Driver Fatigue Driving Detection Methods
    ZHANG Rui, ZHU Tianjun, ZOU Zhiliang, SONG Rui
    Computer Engineering and Applications    2022, 58 (21): 53-66.   DOI: 10.3778/j.issn.1002-8331.2204-0053
    Abstract (931)      PDF(pc) (946KB) (423)
    The proportion of traffic accidents caused by fatigue driving has increased year by year, which has attracted widespread attention from researchers. At present, research on fatigue driving detection is limited by factors such as the level of science and technology, the environment, and road conditions, which makes it difficult to develop the technology further. This article introduces the progress made in driver fatigue detection methods over the past decade. The two categories of active and passive detection methods are elaborated and reviewed, and each category is further classified according to its characteristics. The advantages and limitations of the various fatigue driving detection methods are analyzed, and the detection algorithms used in active detection methods based on facial features over the past three years are analyzed and summarized. Finally, the shortcomings of the various fatigue driving detection methods are summarized, and future research trends in the field of fatigue detection are proposed, providing new ideas for further research.
    Reference | Related Articles | Metrics
    Review of Cross-Domain Object Detection Algorithms Based on Depth Domain Adaptation
    LIU Hualing, PI Changpeng, ZHAO Chenyu, QIAO Liang
    Computer Engineering and Applications    2023, 59 (8): 1-12.   DOI: 10.3778/j.issn.1002-8331.2210-0063
    Abstract (576)      PDF(pc) (583KB) (392)
    In recent years, object detection algorithms based on deep learning have attracted wide attention due to their high detection performance. They have been successfully applied in many fields such as autonomous driving and human-computer interaction. However, traditional deep learning methods assume that the training set (source domain) and the test set (target domain) follow the same distribution. This assumption is often unrealistic, and violating it severely reduces the generalization performance of the model. How to align the distributions of the source domain and the target domain so as to improve the generalization of object detection models has become a research hotspot in the past two years. This article reviews cross-domain object detection algorithms. First, it introduces the preliminary knowledge of cross-domain object detection: deep domain adaptation and object detection. Cross-domain object detection is decomposed into these two areas for an overview, in order to understand its development from the underlying logic. Next, this article introduces the latest developments in cross-domain object detection algorithms in five categories (discrepancy-based, adversarial, reconstruction-based, hybrid and other methods), and sorts out the research context of each category. Finally, this article summarizes and looks ahead to the development trends of cross-domain object detection algorithms.
    Reference | Related Articles | Metrics
    Review on Human Action Recognition Methods Based on Multimodal Data
    WANG Cailing, YAN Jingjing, ZHANG Zhidong
    Computer Engineering and Applications    2024, 60 (9): 1-18.   DOI: 10.3778/j.issn.1002-8331.2310-0090
    Abstract (246)      PDF(pc) (8541KB) (380)
    Human action recognition (HAR) is widely applied in the fields of intelligent security, autonomous driving and human-computer interaction. With advances in capture equipment and sensor technology, the data that can be acquired for HAR is no longer limited to RGB data, but also includes multimodal data such as depth, skeleton, and infrared data. Feature extraction methods in HAR based on the RGB and skeleton data modalities are introduced in detail, including handcrafted and deep learning-based methods. For the RGB modality, feature extraction algorithms based on the two-stream convolutional neural network (2s-CNN), the 3D convolutional neural network (3DCNN) and hybrid networks are analyzed. For the skeleton modality, some popular single-person and multi-person pose estimation algorithms are introduced first. Classification algorithms based on the convolutional neural network (CNN), recurrent neural network (RNN), and graph convolutional neural network (GCN) are then analyzed with particular emphasis. The common datasets for both data modalities are also presented comprehensively. In addition, the current challenges are explored based on the data structure features of the RGB and skeleton modalities. Finally, future research directions for deep learning-based HAR methods are discussed.
    Reference | Related Articles | Metrics
    Survey on Credit Card Transaction Fraud Detection Based on Machine Learning
    JIANG Hongxun, JIANG Junyi, LIANG Xun
    Computer Engineering and Applications    2023, 59 (21): 1-25.   DOI: 10.3778/j.issn.1002-8331.2302-0129
    Abstract (603)      PDF(pc) (674KB) (373)
    Machine learning for credit card transaction fraud detection has distinctive characteristics and faces a more complex environment. Because of the intervention of human intelligence, machine learning encounters harder challenges in fraud detection than in face recognition or autonomous driving, so applying only standard engineering processes leads to failure. This paper traces the research history of credit card anti-fraud since 2000; identifies the definition, scope, technical streams, applications, and other key concepts of detection-oriented machine learning, together with their interconnections; analyzes the general architecture of fraud detection and summarizes the state of the art of transaction fraud detection research in terms of feature engineering, models/algorithms, and evaluation metrics; discusses various detection algorithms for credit card transaction fraud and enumerates their original intentions, core ideas, solution methods, advantages and disadvantages, and relevant extensions; highlights unsupervised, supervised, and semi-supervised learning models for fraud recognition, as well as ensembles such as model cascading and aggregation; and addresses three major challenges, i.e., massive data, sample skew, and concept drift, compiling the latest progress in alleviating these problems. This paper concludes with the limitations, controversies, and challenges of machine learning for credit card fraud recognition, and provides trend analysis and suggestions for future research directions.
    Reference | Related Articles | Metrics
    Review of Object Detection Algorithm Improvement in Deep Learning
    YANG Feng, DING Zhitong, XING Mengmeng, DING Bo
    Computer Engineering and Applications    2023, 59 (11): 1-15.   DOI: 10.3778/j.issn.1002-8331.2209-0312
    Abstract (522)      PDF(pc) (691KB) (364)
    Object detection is currently a research hotspot in the field of computer vision. With the development of deep learning, object detection algorithms based on deep learning are increasingly applied and their performance is constantly improving. This paper summarizes the latest research progress in deep learning-based object detection by reviewing the common problems encountered in object detection and the corresponding improvement methods. It focuses on two types of object detection algorithms based on deep learning. In addition, the latest ideas for improving object detection algorithms are summarized in terms of attention mechanisms, lightweight networks, and multi-scale detection. Finally, in view of the current problems in the field of object detection, future development trends are discussed and feasible solutions are put forward, in order to provide reference ideas and directions for follow-up research in this field.
    Reference | Related Articles | Metrics
    Review of Research on Application of Vision Transformer in Medical Image Analysis
    SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing
    Computer Engineering and Applications    2023, 59 (8): 41-55.   DOI: 10.3778/j.issn.1002-8331.2206-0022
    Abstract (589)      PDF(pc) (869KB) (364)
    The deep self-attention network (Transformer) has a natural ability to model global features and long-range correlations of input information, which strongly complements the inductive bias of convolutional neural networks (CNN). Inspired by its great success in natural language processing, Transformer has been widely introduced into various computer vision tasks, especially medical image analysis, and has achieved remarkable performance. This paper first introduces typical work on vision Transformers for natural images, and then organizes and summarizes the related work according to different lesions or organs in the subfields of medical image segmentation, medical image classification and medical image registration, focusing on the implementation ideas of some representative work. Finally, current research is discussed and future directions are pointed out. The purpose of this paper is to provide a reference for further in-depth research in this field.
    Reference | Related Articles | Metrics
    Computer Engineering and Applications    2022, 58 (22): 0-0.  
    Abstract (195)      PDF(pc) (176505KB) (347)
    Related Articles | Metrics
    Target Detection Algorithm of Remote Sensing Image Based on Improved YOLOv5
    LI Kunya, OU Ou, LIU Guangbin, YU Zefeng, LI Lin
    Computer Engineering and Applications    2023, 59 (9): 207-214.   DOI: 10.3778/j.issn.1002-8331.2209-0119
    Abstract (485)      PDF(pc) (665KB) (337)
    To address the low target detection accuracy caused by high background complexity, varied target sizes and the large number of small targets in remote sensing images, this paper proposes a remote sensing image target detection algorithm based on improved YOLOv5. The channel-global attention mechanism (CGAM) is introduced into the backbone network to enhance the feature extraction ability for targets at different scales and to suppress the interference of redundant information. The dense upsampling convolution (DUC) module is introduced to expand the low-resolution convolution feature maps, which effectively enhances the fusion of different convolution feature maps. The improved algorithm is applied to the open remote sensing dataset RSOD, where its average precision (AP) reaches 78.5%, which is 3.1 percentage points higher than that of the original algorithm. Experimental results show that the improved algorithm can effectively improve the accuracy of remote sensing image target detection.
    Reference | Related Articles | Metrics
    CoT-TransUNet:Lightweight Context Transformer Medical Image Segmentation Network
    YANG He, BAI Zhengyao
    Computer Engineering and Applications    2023, 59 (3): 218-225.   DOI: 10.3778/j.issn.1002-8331.2205-0046
    Abstract (501)      PDF(pc) (645KB) (330)
    To address the overly small receptive field of convolution in previous medical image segmentation networks and the feature loss of Transformer, an end-to-end lightweight context Transformer medical image segmentation network (CoT-TransUNet) is proposed. The network consists of three parts: encoder, decoder, and skip connections. For the input image, the encoder uses CoTNet as a feature extractor to generate feature maps. Transformer blocks encode the feature maps as input sequences. Then, the decoder upsamples the encoded features through a cascaded upsampler. The upsampler cascades multiple upsampling blocks, each of which employs the CARAFE upsampling operator. Finally, feature aggregation of the encoder and decoder at different resolutions is achieved through skip connections. In the feature extraction stage, CoT-TransUNet adopts CoTNet, which combines global and local context information. In the upsampling stage, it adopts the CARAFE operator, which has a larger receptive field. The network thus generates better input feature maps and performs content-based upsampling while remaining lightweight. Experiments on multi-organ segmentation tasks show that CoT-TransUNet achieves better performance than other networks.
    Reference | Related Articles | Metrics
    Survey on Emotion Recognition in Conversation
    CHEN Xiaoting, LI Shi
    Computer Engineering and Applications    2023, 59 (3): 33-48.   DOI: 10.3778/j.issn.1002-8331.2207-0417
    Abstract (600)      PDF(pc) (681KB) (327)
    Emotion recognition in conversation (ERC) is a hot research topic in the field of affective computing, which aims to detect the emotion category of each utterance in a dialogue. It has important research significance for dialogue understanding and dialogue generation. At the same time, it has wide practical application value in many fields, such as social media analysis, recommendation systems, medical treatment and human-computer interaction. With the continuous innovation and development of deep learning technology, emotion recognition in conversation has attracted more and more attention from academia and industry. At this stage, it is necessary to summarize these research results in an overview article in order to better carry out follow-up work. The research results in this field are comprehensively sorted out from the perspectives of problem definition, problem approach, research methods, and mainstream datasets, and the development of dialogue emotion recognition tasks is reviewed and analyzed. Compared with video and audio, dialogue text contains more information. Therefore, this paper focuses on reviewing text-based dialogue emotion recognition methods, especially those based on deep learning. Finally, based on the current research status, this paper summarizes the open problems in the field of dialogue emotion recognition and its future development trends.
    Reference | Related Articles | Metrics
    Review of Path Planning Algorithms for Robot Navigation
    CUI Wei, ZHU Fazheng
    Computer Engineering and Applications    2023, 59 (19): 10-20.   DOI: 10.3778/j.issn.1002-8331.2301-0088
    Abstract (598)      PDF(pc) (595KB) (323)
    Path planning is one of the key technologies for robot navigation. An excellent path planning algorithm can quickly find the best collision-free path and improve operational efficiency. Most existing classification methods have difficulty expressing the differences and connections between algorithms. To distinguish different path planning algorithms more clearly, they are divided by principle and nature into graph search-based, bionic-based, potential field-based, velocity space-based and sampling-based algorithms. This paper introduces the concept, characteristics, and development status of each type of algorithm, analyzes the more widely used sampling-based algorithms from the perspective of single-query and multi-query algorithms, and compares and summarizes the advantages and problems of the different types of path planning algorithms. Finally, the future development trends of robot path planning algorithms in terms of multi-robot collaboration, multi-algorithm fusion and adaptive planning are discussed.
    Reference | Related Articles | Metrics
    Review of Recommendation Systems Using Knowledge Graph
    ZHANG Mingxing, ZHANG Xiaoxiong, LIU Shanshan, TIAN Hao, YANG Qinqin
    Computer Engineering and Applications    2023, 59 (4): 30-42.   DOI: 10.3778/j.issn.1002-8331.2209-0033
    Abstract (508)      PDF(pc) (702KB) (318)
    With the rapid development of the Internet, obtaining the needed information from huge amounts of data has become increasingly important. The recommendation system is a method of screening information, which aims to recommend personalized content to users. However, traditional recommendation algorithms still suffer from several challenges, such as data sparsity and cold start. In recent years, researchers have used the rich entity and relationship information in knowledge graphs to alleviate these problems and enhance the overall performance of recommendation systems. This paper reviews recommendation systems based on knowledge graphs from three aspects. Firstly, the basic concepts of recommendation systems and knowledge graphs are introduced, and the shortcomings of existing recommendation algorithms are pointed out. Then, research on recommendation systems based on knowledge graphs is analyzed in detail, and the advantages and challenges of the different approaches are assessed. Finally, relevant application scenarios and future development prospects are summarized.
    Reference | Related Articles | Metrics
    Algorithm for Real-Time Vehicle Detection from UAVs Based on Optimizing and Improving YOLOv8
    SHI Tao, CUI Jie, LI Song
    Computer Engineering and Applications    2024, 60 (9): 79-89.   DOI: 10.3778/j.issn.1002-8331.2312-0291
    Abstract (242)      PDF(pc) (4614KB) (318)
    To address the low accuracy, susceptibility to background interference and difficulty in detecting small target vehicles of existing UAV vehicle detection algorithms, an improved UAV vehicle detection algorithm, YOLOv8-CX, is proposed based on YOLOv8. By integrating the advantages of Deformable Convolutional Networks v1-3, a C2f-DCN module is proposed to sample features flexibly and better extract features of vehicles of different sizes. Drawing on the idea of large separable kernel attention, an SPPF-LSKA module with long-range dependency and self-adaptability is proposed, which can effectively reduce background interference in vehicle detection. In the neck network, a CF-FPN (ment network for tiny object detection) feature fusion structure is adopted to enhance the detection accuracy of small targets by combining contextual information and suppressing conflicts between features at different scales. Finally, the original YOLOv8 head is replaced with a Dynamic Head detection head, which unifies three types of attention mechanisms (scale, space and task) to further improve detection performance. Experimental results show that on the Mapsai dataset, compared with the original algorithm, the improved algorithm increases precision (P), recall (R) and mean average precision (mAP) by 8.5, 11.2 and 6.2 percentage points respectively, and its detection speed reaches 72.6 FPS, meeting the real-time requirements of UAV vehicle detection. Comparison with other mainstream target detection algorithms validates the effectiveness and superiority of this method.
    Reference | Related Articles | Metrics