Most Read articles

    In last 3 years
    Progress on Deep Reinforcement Learning in Path Planning
    ZHANG Rongxia, WU Changxu, SUN Tongchao, ZHAO Zengshun
    Computer Engineering and Applications    2021, 57 (19): 44-56.   DOI: 10.3778/j.issn.1002-8331.2104-0369
    Abstract views: 2572 | PDF (1134 KB) downloads: 1057

    The purpose of path planning is to enable a robot to avoid obstacles and quickly find the shortest path as it moves. After analyzing the advantages and disadvantages of path planning algorithms based on reinforcement learning, the paper focuses on a representative deep reinforcement learning algorithm, the Deep Q-learning Network (DQN), which can plan paths effectively in complex dynamic environments. The basic principles and limitations of the DQN algorithm are analyzed in depth, and the strengths and weaknesses of various DQN variants are compared from four aspects: the training algorithm, the neural network structure, the learning mechanism, and the Actor-Critic (AC) framework. The paper then sets out the current challenges and open problems in path planning based on deep reinforcement learning and proposes future development directions, providing a reference for the development of intelligent path planning and autonomous driving.
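The DQN family surveyed here shares one core update: bootstrapping toward the target y = r + γ·max_a′ Q(s′, a′). A minimal tabular Q-learning sketch on a toy grid with one obstacle illustrates that target (the grid, rewards, and hyperparameters are illustrative, not taken from the paper; DQN proper replaces the table with a neural network and adds experience replay and a target network):

```python
import random

# Toy 4x4 grid: start (0,0), goal (3,3), one obstacle at (1,1).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
OBSTACLE, GOAL, SIZE = (1, 1), (3, 3), 4

def step(state, action):
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) == OBSTACLE:
        return state, -1.0, False          # blocked: stay put, small penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -0.1, False             # step cost encourages short paths

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE)
         for a in range(len(ACTIONS))}
    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            a = (rng.randrange(len(ACTIONS)) if rng.random() < eps
                 else max(range(len(ACTIONS)), key=lambda i: q[(s, i)]))
            s2, r, done = step(s, ACTIONS[a])
            # DQN-style target: y = r + gamma * max_a' Q(s', a')
            target = r + (0 if done else
                          gamma * max(q[(s2, i)] for i in range(len(ACTIONS))))
            q[(s, a)] += alpha * (target - q[(s, a)])   # TD update toward y
            s = s2
    return q

def greedy_path(q, limit=20):
    s, path = (0, 0), [(0, 0)]
    while s != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda i: q[(s, i)])
        s, _, _ = step(s, ACTIONS[a])
        path.append(s)
    return path
```

After training, the greedy policy reads a collision-free near-shortest path off the table, which is exactly the behavior the deep variants scale up to large state spaces.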

    Improved Lightweight Attention Model Based on CBAM
    FU Guodong, HUANG Jin, YANG Tao, ZHENG Siyu
    Computer Engineering and Applications    2021, 57 (20): 150-156.   DOI: 10.3778/j.issn.1002-8331.2101-0369
    Abstract views: 2377 | PDF (808 KB) downloads: 751

    In recent years, attention models have been widely used in computer vision; adding an attention module to a convolutional neural network can significantly improve its performance. However, most existing methods focus on developing ever more complex attention modules to give the network stronger feature representation, which inevitably increases model complexity. To balance performance and complexity, a lightweight EAM (Efficient Attention Module) is proposed to optimize the CBAM model. In the channel attention module of CBAM, one-dimensional convolution replaces the fully connected layer to aggregate channel information; in the spatial attention module, dilated convolution replaces the large convolution kernel to enlarge the receptive field and aggregate broader spatial context. After the module is integrated into YOLOv4 and tested on the VOC2012 dataset, mAP increases by 3.48 percentage points. Experimental results show that the attention module introduces only a small number of parameters while greatly improving network performance.
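The two substitutions described above can be sketched in isolation. Below, a 1D convolution slides over per-channel descriptors in place of CBAM's fully connected layers, and a dilated convolution (shown in 1D for brevity) demonstrates the enlarged receptive field; kernels and inputs are illustrative, not the paper's learned weights:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention_1d(gap, kernel):
    """ECA-style channel attention: instead of fully connected layers, a
    1D convolution slides over the per-channel descriptors (the
    global-average-pooled values `gap`), capturing local cross-channel
    interaction with only len(kernel) parameters."""
    k, pad = len(kernel), len(kernel) // 2
    padded = [gap[0]] * pad + gap + [gap[-1]] * pad   # replicate-pad ends
    weights = []
    for c in range(len(gap)):
        s = sum(kernel[i] * padded[c + i] for i in range(k))
        weights.append(sigmoid(s))
    return weights  # one gate in (0, 1) per channel

def dilated_conv1d(row, kernel, dilation):
    """Dilated convolution: taps are spaced `dilation` apart, so a small
    kernel covers a receptive field of (k-1)*dilation + 1 positions --
    the idea behind replacing a large dense spatial kernel."""
    k, span = len(kernel), (len(kernel) - 1) * dilation
    return [sum(kernel[i] * row[j + i * dilation] for i in range(k))
            for j in range(len(row) - span)]
```

A kernel of size 3 with dilation 2 sees 5 positions at the cost of 3 weights, which is the parameter/receptive-field trade the abstract is making.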

    Survey of Network Traffic Forecast Based on Deep Learning
    KANG Mengxuan, SONG Junping, FAN Pengfei, GAO Bowen, ZHOU Xu, LI Zhuo
    Computer Engineering and Applications    2021, 57 (10): 1-9.   DOI: 10.3778/j.issn.1002-8331.2101-0402
    Abstract views: 1878 | PDF (711 KB) downloads: 1698

    Precisely predicting the trend of network traffic helps operators anticipate network usage and allocate and use network resources correctly and efficiently to meet growing and diverse user needs. Taking the progress of deep learning algorithms in network traffic prediction as its thread, this paper first describes the evaluation metrics of network traffic prediction and the currently available public network traffic datasets. It then analyzes four deep learning methods commonly used in network traffic prediction: deep belief networks, convolutional neural networks, recurrent neural networks, and long short-term memory networks, focusing on the integrated neural network models applied to different problems in recent years and summarizing each model's characteristics and application scenarios. Finally, the future development of network traffic forecasting is discussed.

    Review of Attention Mechanism in Convolutional Neural Networks
    ZHANG Chenjia, ZHU Lei, YU Lu
    Computer Engineering and Applications    2021, 57 (20): 64-72.   DOI: 10.3778/j.issn.1002-8331.2105-0135
    Abstract views: 1796 | PDF (973 KB) downloads: 1166

    The attention mechanism is widely used in deep learning tasks because of its strong performance and plug-and-play convenience. Focusing on convolutional neural networks, this paper introduces the mainstream methods that have emerged during the development of attention mechanisms for convolutional networks, extracts and summarizes their core ideas and implementation processes, implements each attention method, and carries out comparative experiments and analysis on measured data from the same type of emitter equipment. Based on these ideas and the experimental results, the research status and future development directions of attention mechanisms in convolutional networks are summarized.

    Review of Development and Application of Artificial Neural Network Models
    ZHANG Chi, GUO Yuan, LI Ming
    Computer Engineering and Applications    2021, 57 (11): 57-69.   DOI: 10.3778/j.issn.1002-8331.2102-0256
    Abstract views: 1778 | PDF (781 KB) downloads: 1692

    Artificial neural networks are increasingly intertwined with other subject areas, and problems in many fields are solved by exploring and improving their layer structures. Based on an analysis of the related literature, this paper reviews the history of artificial neural networks and presents their underlying principles as they developed, including the multilayer perceptron, the back-propagation algorithm, convolutional neural networks, and recurrent neural networks. It explains the classic models that arose during the development of convolutional neural networks and the widely used variant structures of recurrent neural networks, reviews the applications of each neural network algorithm in related fields, and summarizes possible directions for future development.

    Survey of Multimodal Data Fusion
    REN Zeyu, WANG Zhenchao, KE Zunwang, LI Zhe, Wushour·Silamu
    Computer Engineering and Applications    2021, 57 (18): 49-64.   DOI: 10.3778/j.issn.1002-8331.2104-0237
    Abstract views: 1597 | PDF (1214 KB) downloads: 1734

    With the rapid development of information technology, information exists in many forms and comes from many sources. Each form of existence or information source can be called a modality, and data composed of two or more modalities is called multi-modal data. Multi-modal data fusion integrates the information of multiple modalities effectively, drawing on the advantages of each. Natural phenomena have very rich characteristics, and a single modality rarely provides complete information about a phenomenon. Faced with the requirements of preserving the diversity and completeness of modal information after fusion, maximizing the advantages of each modality, and reducing the information loss caused by the fusion process, how to integrate the information of each modality has become a new challenge in many fields. This paper briefly describes common multi-modal fusion methods and fusion architectures, summarizes three common fusion models, and analyzes the advantages and disadvantages of the collaborative, joint, and encoder-decoder architectures, as well as specific fusion methods such as multiple kernel learning and graphical models. On the application side, it analyzes and summarizes multi-modal video clip retrieval, multi-modal content summarization, multi-modal sentiment analysis, and multi-modal human-machine dialogue systems. The paper also identifies the current problems of multi-modal fusion and future research directions.

    Review of Text Sentiment Analysis Methods
    WANG Ting, YANG Wenzhong
    Computer Engineering and Applications    2021, 57 (12): 11-24.   DOI: 10.3778/j.issn.1002-8331.2101-0022
    Abstract views: 1438 | PDF (906 KB) downloads: 1386

    Text sentiment analysis is an important branch of natural language processing and a hot topic in recent years, widely used in public opinion analysis and content recommendation. According to the methods used, it can be divided into sentiment analysis based on sentiment dictionaries, on traditional machine learning, and on deep learning. By comparing these three kinds of methods, this paper analyzes their research results, summarizes the advantages and disadvantages of each, introduces the related datasets, evaluation metrics, and application scenarios, and briefly reviews the sentiment analysis subtasks. Future research trends and application areas of sentiment analysis are identified, providing help and guidance for researchers in related fields.

    Multi-channel Attention Mechanism Text Classification Model Based on CNN and LSTM
    TENG Jinbao, KONG Weiwei, TIAN Qiaoxin, WANG Zhaoqian, LI Long
    Computer Engineering and Applications    2021, 57 (23): 154-162.   DOI: 10.3778/j.issn.1002-8331.2104-0212
    Abstract views: 1437 | PDF (844 KB) downloads: 486

    Aiming at the problem that traditional Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) cannot reflect the importance of each word in the text when extracting features, this paper proposes a multi-channel text classification model based on CNN and LSTM. First, CNN and LSTM are used to extract the local information and context features of the text; second, a multi-channel attention mechanism is used to compute attention scores over the output of the CNN and the LSTM; finally, the outputs of the multi-channel attention mechanism are fused to extract text features effectively and focus attention on important words. Experimental results on three public datasets show that the proposed model outperforms CNN, LSTM, and their improved variants, and can effectively improve text classification.
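The per-channel attention-and-fuse step outlined above can be sketched as follows; the query vector stands in for learned attention parameters, and concatenation is one plausible fusion choice, not necessarily the paper's exact design:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(features, query):
    """Score each word's feature vector against a (learned, here given)
    query vector, softmax the scores, and return the weighted sum --
    so important words dominate the pooled representation."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * feat[d] for w, feat in zip(weights, features))
              for d in range(dim)]
    return pooled, weights

def fuse_channels(cnn_vec, lstm_vec):
    """Fuse the two attention-pooled channel outputs by concatenation,
    the simplest fusion choice; the classifier head would sit on top."""
    return cnn_vec + lstm_vec
```

One channel would pool CNN n-gram features and the other LSTM context features; the fused vector then feeds a softmax classifier.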

    Research Progress of Multi-label Text Classification
    HAO Chao, QIU Hangping, SUN Yi, ZHANG Chaoran
    Computer Engineering and Applications    2021, 57 (10): 48-56.   DOI: 10.3778/j.issn.1002-8331.2101-0096
    Abstract views: 1408 | PDF (906 KB) downloads: 1048

    As a basic task in natural language processing, text classification has been studied since the 1950s. Single-label text classification algorithms are now mature, but there is still much room for improvement in multi-label text classification. This paper first introduces the basic concepts and pipeline of multi-label text classification, including dataset acquisition, text preprocessing, model training, and prediction. It then introduces the methods of multi-label text classification, which fall into two categories: traditional machine learning methods and deep learning methods. Traditional machine learning methods mainly include problem transformation methods and algorithm adaptation methods. Deep learning methods use various neural network models to handle multi-label text classification and, according to model structure, can be divided into methods based on CNN, RNN, and Transformer architectures. The datasets commonly used in multi-label text classification are summarized, and future development trends are analyzed.

    YOLOv5 Helmet Wear Detection Method with Introduction of Attention Mechanism
    WANG Lingmin, DUAN Jun, XIN Liwei
    Computer Engineering and Applications    2022, 58 (9): 303-312.   DOI: 10.3778/j.issn.1002-8331.2112-0242
    Abstract views: 1307 | PDF (1381 KB) downloads: 734
    For high-risk industries such as steel manufacturing, coal mining, and construction, wearing a helmet during work is one of the effective ways to avoid injury. Current helmet-wearing detection models suffer from false and missed detections on small, dense targets in complex environments, so an improved YOLOv5 detection method is proposed. A coordinate attention mechanism is added to the backbone network of YOLOv5, embedding location information into channel attention so that the network can attend to a larger area. The original feature pyramid module in the feature fusion module is replaced with a weighted bi-directional feature pyramid network (BiFPN) to achieve efficient bi-directional cross-scale connectivity and weighted feature fusion. Experimental results on a homemade helmet dataset show that the improved YOLOv5 model achieves a mean average precision of 95.9%, 5.1 percentage points higher than the original YOLOv5, meeting the requirements for detecting small, dense targets in complex environments.
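BiFPN's weighted feature fusion uses fast normalized weights, O = Σᵢ wᵢ·Iᵢ / (ε + Σⱼ wⱼ) with each wᵢ clamped non-negative. A minimal sketch with toy flattened feature maps (the weights are learnable in the real network):

```python
def bifpn_fuse(inputs, weights, eps=1e-4):
    """BiFPN's fast normalized fusion: each input feature map gets a
    learnable non-negative scalar weight, normalized so the weights sum
    to roughly one. `inputs` are equal-sized flattened feature maps."""
    w = [max(x, 0.0) for x in weights]          # ReLU keeps weights >= 0
    denom = eps + sum(w)
    size = len(inputs[0])
    return [sum(w[i] * inputs[i][k] for i in range(len(inputs))) / denom
            for k in range(size)]
```

Because the normalization avoids the softmax exponential, it is cheap enough to run at every cross-scale connection, which is why the abstract pairs it with the bi-directional pyramid.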
    Research Progress on Vision System and Manipulator of Fruit Picking Robot
    GOU Yuanmin, YAN Jianwei, ZHANG Fugui, SUN Chengyu, XU Yong
    Computer Engineering and Applications    2023, 59 (9): 13-26.   DOI: 10.3778/j.issn.1002-8331.2209-0183
    Abstract views: 1278 | PDF (787 KB) downloads: 849
    Fruit-picking robots are of great significance to the automation and intelligence of fruit-production equipment. This paper summarizes recent research at home and abroad on the key technologies of fruit-picking robots. First, the key technologies of the vision system are discussed: traditional image segmentation methods based on fruit features, such as thresholding, edge detection, color-feature clustering, and region-based segmentation, and deep learning object recognition algorithms and target fruit localization, which are analyzed and compared. The state of the art in fruit-picking manipulators and end-effectors is then summarized. Finally, future development trends and directions of fruit-picking robots are discussed, providing a reference for related research.
    Overview of Multi-Agent Path Finding
    LIU Zhifei, CAO Lei, LAI Jun, CHEN Xiliang, CHEN Ying
    Computer Engineering and Applications    2022, 58 (20): 43-64.   DOI: 10.3778/j.issn.1002-8331.2203-0467
    Abstract views: 1258 | PDF (1013 KB) downloads: 539
    The multi-agent path finding (MAPF) problem is the problem of planning paths for multiple agents, with the key constraint that the agents can follow these paths concurrently without colliding with each other. MAPF is widely used in logistics, the military, security, and other fields. Systematically sorting and classifying the main MAPF research results at home and abroad by planning method, MAPF algorithms can be divided into centralized planning algorithms and distributed execution algorithms. Centralized planning is the most classical and most commonly used family, and is further divided into four kinds of algorithms: those based on A* search, conflict-based search, increasing cost trees, and protocols. The distributed execution algorithms are based on reinforcement learning and, according to the improvement technique used, can be divided into three types: expert demonstration, improved communication, and task decomposition. Based on this classification, the characteristics and applicability of MAPF algorithms are compared, the advantages and disadvantages of existing algorithms are analyzed, the challenges facing existing algorithms are pointed out, and future work is forecast.
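The A*-based planners at the root of the centralized family build on single-agent grid search; conflict-based search, for instance, repeatedly re-runs such searches under collision constraints. A minimal 4-connected A* sketch (grid and costs illustrative):

```python
import heapq

def astar(grid, start, goal):
    """Minimal 4-connected A* with Manhattan heuristic -- the single-agent
    search primitive that centralized MAPF planners extend with time steps
    and inter-agent collision constraints."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start, None)]
    came, best_g = {}, {start: 0}
    while open_heap:
        f, g, cur, parent = heapq.heappop(open_heap)
        if cur in came:                     # already expanded
            continue
        came[cur] = parent
        if cur == goal:                     # walk parents back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]]:        # 1 marks a blocked cell
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, cur))
    return None                             # goal unreachable
```

Conflict-based search would call a time-expanded version of this per agent, adding vertex/edge constraints whenever two returned paths collide.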
    Research Progress of Transformer Based on Computer Vision
    LIU Wenting, LU Xinming
    Computer Engineering and Applications    2022, 58 (6): 1-16.   DOI: 10.3778/j.issn.1002-8331.2106-0442
    Abstract views: 1253 | PDF (1089 KB) downloads: 818
    The Transformer is a deep neural network based on the self-attention mechanism that processes data in parallel. In recent years, Transformer-based models have become an important research area for computer vision tasks. Addressing the current lack of domestic review articles on the Transformer, this paper surveys its applications in computer vision. It reviews the basic principles of the Transformer model, focuses on its application to seven visual tasks including image classification, object detection, and segmentation, and analyzes Transformer-based models with notable results. Finally, the paper summarizes the challenges and future development trends of the Transformer model in computer vision.
    Research on Object Detection Algorithm Based on Improved YOLOv5
    QIU Tianheng, WANG Ling, WANG Peng, BAI Yan’e
    Computer Engineering and Applications    2022, 58 (13): 63-73.   DOI: 10.3778/j.issn.1002-8331.2202-0093
    Abstract views: 1173 | PDF (1109 KB) downloads: 497
    YOLOv5 currently performs well in single-stage object detection, but the accuracy of its bounding-box regression is limited, making it hard to apply in scenarios with strict requirements on the intersection-over-union of prediction boxes. Based on YOLOv5, this paper proposes YOLO-G, a new model with low hardware requirements, fast convergence, and accurate bounding boxes. First, the feature pyramid network (FPN) is improved: more features are integrated through cross-level connections, which prevents the loss of shallow semantic information to some extent, and the pyramid is deepened with a corresponding additional detection layer so that the spacing of the anchor boxes is more reasonable. Second, a parallel attention mechanism is integrated into the network, giving the spatial and channel attention modules the same priority; the attention information is then extracted by weighted fusion, so that the network fuses mixed-domain attention according to the importance of the spatial and channel components. Finally, to keep the added complexity from hurting real-time performance, the network is lightened to reduce its parameter count and computation. The PASCAL VOC 2007 and 2012 datasets are used to verify the effectiveness of the algorithm. Compared with YOLOv5s, YOLO-G reduces the number of parameters by 4.7% and the amount of computation by 47.9%, while mAP@0.5 and mAP@0.5:0.95 increase by 3.1 and 5.6 percentage points respectively.
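The parallel spatial/channel attention with weighted fusion can be sketched on a toy feature map; the gating functions and fusion weights below are illustrative stand-ins for the learned modules, not the paper's exact design:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def parallel_attention(feat, w_channel=0.5, w_spatial=0.5):
    """Parallel (not sequential) mixed-domain attention: channel and
    spatial gates are computed independently from the same feature map
    and combined by a weighted sum before gating. `feat` is a toy
    [channels][positions] map; the fusion weights stand in for learned
    parameters."""
    C, N = len(feat), len(feat[0])
    # channel gate: sigmoid of each channel's mean activation
    ch = [sigmoid(sum(row) / N) for row in feat]
    # spatial gate: sigmoid of the cross-channel mean at each position
    sp = [sigmoid(sum(feat[c][k] for c in range(C)) / C) for k in range(N)]
    # weighted fusion of the two gates, then apply to the features
    return [[feat[c][k] * (w_channel * ch[c] + w_spatial * sp[k])
             for k in range(N)] for c in range(C)]
```

Because neither gate is applied before the other, neither domain's attention dominates; the fusion weights decide their relative priority.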
    Overview on Reinforcement Learning of Multi-agent Game
    WANG Jun, CAO Lei, CHEN Xiliang, LAI Jun, ZHANG Legui
    Computer Engineering and Applications    2021, 57 (21): 1-13.   DOI: 10.3778/j.issn.1002-8331.2104-0432
    Abstract views: 1100 | PDF (779 KB) downloads: 1214

    Deep reinforcement learning has made breakthrough progress on single-agent tasks. Because of the complexity of multi-agent systems, however, common algorithms cannot overcome the main difficulties. As the number of agents increases, taking the maximization of each agent's expected cumulative return as the learning goal often fails to converge, and some of the special convergence points do not yield rational strategies. For practical problems without an optimal solution, reinforcement learning algorithms are even more helpless. Introducing game theory into reinforcement learning can capture the interrelationships among agents, explain the rationality of the strategy at a convergence point, and, more importantly, replace the optimal solution with an equilibrium solution in order to obtain a relatively effective strategy. This article therefore surveys recent reinforcement learning algorithms from the perspective of game theory, summarizes the key open problems of current game-theoretic reinforcement learning, and suggests several directions that may resolve the above difficulties.

    Survey of Opponent Modeling Methods and Applications in Intelligent Game Confrontation
    WEI Tingting, YUAN Weilin, LUO Junren, ZHANG Wanpeng
    Computer Engineering and Applications    2022, 58 (9): 19-29.   DOI: 10.3778/j.issn.1002-8331.2202-0297
    Abstract views: 1068 | PDF (904 KB) downloads: 411
    Intelligent game confrontation has always been a focus of artificial intelligence research. In a game confrontation environment, the actions, goals, strategies, and other attributes of an agent can be inferred through opponent modeling, providing key information for formulating game strategies. Opponent modeling is promising in competitive games and combat simulation, and because a game strategy must be premised on the action strategies of all parties, building an accurate model of opponent behavior to predict its intentions is especially important. This paper expounds the necessity of opponent modeling from three dimensions, connotation, method, and application, and classifies the existing modeling methods. Prediction methods based on reinforcement learning, reasoning methods based on theory of mind, and optimization methods based on Bayesian inference are summarized. Taking sequential games (Texas Hold'em), real-time strategy games (StarCraft), and meta-games as typical application scenarios, the role of opponent modeling in intelligent game confrontation is analyzed. Finally, the development of opponent modeling technology is discussed from three aspects: bounded rationality, deception strategies, and interpretability.
    Overview of Chinese Domain Named Entity Recognition
    JIAO Kainan, LI Xin, ZHU Rongchen
    Computer Engineering and Applications    2021, 57 (16): 1-15.   DOI: 10.3778/j.issn.1002-8331.2103-0127
    Abstract views: 1065 | PDF (928 KB) downloads: 644

    Named Entity Recognition (NER), a classic research topic in natural language processing, is a basic technology for intelligent question answering, knowledge graphs, and other tasks. Domain Named Entity Recognition (DNER) is the domain-specific NER scheme. Driven by deep learning, Chinese DNER has made breakthrough progress. This paper first summarizes the research framework of Chinese DNER and reviews existing results from four aspects: the determination of domain data sources, the establishment of domain entity types and specifications, the annotation of domain datasets, and the evaluation metrics of Chinese DNER. It then summarizes the common technical frameworks of Chinese DNER, covering pattern matching based on dictionaries and rules, statistical machine learning, deep learning, and multi-party fused deep learning methods, with a focus on Chinese DNER methods based on word vector representations and deep learning. Finally, typical application scenarios of Chinese DNER are discussed and future directions are projected.

    Review of Neural Style Transfer Models
    TANG Renwei, LIU Qihe, TAN Hao
    Computer Engineering and Applications    2021, 57 (19): 32-43.   DOI: 10.3778/j.issn.1002-8331.2105-0296
    Abstract views: 1055 | PDF (1078 KB) downloads: 704

    Neural Style Transfer (NST) simulates different artistic styles in images and videos and is a popular topic in computer vision. This paper provides a comprehensive overview of current progress in NST. It first reviews Non-Photorealistic Rendering (NPR) and traditional texture transfer. It then categorizes the major NST methods and describes them in detail along with their subsequent improvements. After that, it discusses various applications of NST and presents several evaluation methods that compare different style transfer models both qualitatively and quantitatively. Finally, it summarizes the existing problems and offers some future research directions for NST.

    Review on Integration Analysis and Application of Multi-omics Data
    ZHONG Yating, LIN Yanmei, CHEN Dingjia, PENG Yuzhong, ZENG Yuanpeng
    Computer Engineering and Applications    2021, 57 (23): 1-17.   DOI: 10.3778/j.issn.1002-8331.2106-0341
    Abstract views: 1015 | PDF (806 KB) downloads: 1655

    With the continuing emergence and spread of new omics sequencing technologies, large amounts of omics data have been produced, which is of great significance for further studying and revealing the mysteries of life. Integrated analysis of multi-omics data can provide richer and more comprehensive information about a living system, and has become a new direction for scientists exploring the mechanisms of life. This paper introduces the research background and significance of multi-omics data integration analysis, summarizes the integration methods of recent years and their applications in related fields, and finally discusses the current problems and future prospects of multi-omics integration methods.

    Research on Local Path Planning Algorithm Based on Improved TEB Algorithm
    DAI Wanyu, ZHANG Lijuan, WU Jiafeng, MA Xianghua
    Computer Engineering and Applications    2022, 58 (8): 283-288.   DOI: 10.3778/j.issn.1002-8331.2108-0290
    Abstract views: 984 | PDF (878 KB) downloads: 193
    When the traditional TEB (timed elastic band) algorithm plans a path in a complex dynamic environment, unsmooth velocity commands cause path vibrations, which subject the robot to larger impacts and make collisions more likely. To address these problems, the traditional TEB algorithm is improved. Detected irregular obstacles are inflated and classified by region, and routes through the safe region are given priority so that the robot runs more safely and smoothly in complex environments. Adding the obstacle distance to the velocity constraint of the algorithm effectively reduces the vibration amplitude and the impact caused by velocity jumps when the robot approaches an obstacle, ensuring safe operation. Extensive comparative simulations in the ROS environment show that, in complex dynamic environments, the path planned by the improved TEB algorithm is safer and smoother and effectively reduces impacts on the robot.
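The two improvements, obstacle inflation and an obstacle-distance-aware velocity constraint, can be sketched as a preprocessing step and a penalty term; all thresholds and weights below are illustrative, not the paper's tuned values:

```python
def velocity_penalty(v, d_obs, v_max=0.5, d_safe=1.0, weight=10.0):
    """Soft velocity constraint with obstacle distance folded in: the
    allowed speed shrinks linearly once the robot is closer than
    `d_safe` to an obstacle, and the trajectory optimizer pays a
    quadratic penalty for exceeding it, discouraging velocity jumps
    near obstacles."""
    allowed = v_max * min(1.0, d_obs / d_safe)   # slower near obstacles
    excess = max(0.0, v - allowed)
    return weight * excess ** 2

def inflate(obstacle_cells, radius=1):
    """Obstacle inflation on a grid: every cell within Chebyshev
    distance `radius` of a detected obstacle cell is also treated as
    blocked, so planned bands keep clearance from irregular shapes."""
    grown = set()
    for (r, c) in obstacle_cells:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                grown.add((r + dr, c + dc))
    return grown
```

In a TEB-style optimizer this penalty would be summed over every pose in the elastic band alongside the usual time, kinematic, and clearance terms.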
    Overview of Research on 3D Human Pose Estimation
    WANG Faming, LI Jianwei, CHEN Sixi
    Computer Engineering and Applications    2021, 57 (10): 26-38.   DOI: 10.3778/j.issn.1002-8331.2102-0039
    Abstract views: 979 | PDF (1035 KB) downloads: 1179

    3D human pose estimation, which estimates the 3D pose of a human body from images, is essentially a classification and regression problem. Traditional methods and deep learning methods are the mainstream approaches in this field. This paper systematically introduces recent 3D human pose estimation methods, moving from the traditional methods to the deep learning methods. Traditional methods generally obtain the elements of human pose through generative and discriminative techniques to complete the estimation. Deep learning methods regress human pose information from image features by constructing neural networks and can be roughly divided into three categories: direct regression methods, methods based on 2D information, and hybrid methods. Finally, current research difficulties and challenges are summarized and research trends are discussed.

    Overview of Visual Multi-object Tracking Algorithms with Deep Learning
    ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa
    Computer Engineering and Applications    2021, 57 (13): 55-66.   DOI: 10.3778/j.issn.1002-8331.2102-0260
    Abstract views: 958 | PDF (931 KB) downloads: 843

    Visual multi-object tracking is a hot issue in computer vision. However, the uncertain number of targets in the scene, mutual occlusion between targets, and the difficulty of discriminating between target features have slowed progress in real-world applications of visual multi-object tracking. In recent years, with continued research on intelligent visual processing, a variety of deep learning multi-object tracking algorithms have emerged. After analyzing the challenges and difficulties faced by visual multi-object tracking, these algorithms are divided into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), and six sub-categories, and their advantages and disadvantages are studied. The analysis shows that DBT algorithms have a simple structure but little correlation between their sub-steps, while JDT algorithms integrate multi-module joint learning and dominate on multiple tracking evaluation metrics. The feature extraction module is the key to handling target occlusion in DBT algorithms, at the expense of speed, while JDT algorithms depend more heavily on the detection module. At present, multi-object tracking is generally developing from DBT algorithms toward JDT, progressively balancing accuracy and speed. Future development directions for multi-object tracking in terms of datasets, sub-modules, and specific scenarios are proposed.

    Survey on Zero-Shot Learning
    WANG Zeshen, YANG Yun, XIANG Hongxin, LIU Qing
    Computer Engineering and Applications    2021, 57 (19): 1-17.   DOI: 10.3778/j.issn.1002-8331.2106-0133
    Abstract views: 943 | PDF (1267 KB) downloads: 555

    Although zero-shot learning has developed well alongside deep learning, its applications have lacked a systematic organization. This paper reviews the theoretical foundations of zero-shot learning, its typical models, application systems, present challenges, and future research directions. First, it introduces the theory: the definition of zero-shot learning, its essential problems, and the commonly used datasets. Second, typical models of zero-shot learning are described in chronological order. Third, the applications of zero-shot learning are presented along three dimensions: words, images, and videos. Finally, the paper analyzes the challenges and future research directions of zero-shot learning.

    Review of Typical Object Detection Algorithms for Deep Learning
    XU Degang, WANG Lu, LI Fan
    Computer Engineering and Applications    2021, 57 (8): 10-25.   DOI: 10.3778/j.issn.1002-8331.2012-0449
    Abstract937)      PDF(pc) (736KB)(779)       Save

    Object detection is an important research direction of computer vision; its purpose is to accurately identify the category and location of specific target objects in a given image. In recent years, the feature learning and transfer learning capabilities of deep convolutional neural networks have brought significant progress in feature extraction, image representation, classification and recognition for object detection algorithms. This paper introduces the research progress of object detection algorithms based on deep learning, the characteristics of common data sets and the key parameters of performance evaluation metrics, and compares and analyzes the network structures and implementations of two-stage, single-stage and other improved object detection algorithms. Finally, the application progress of these algorithms in the detection of human faces, salient objects, pedestrians, remote sensing images, medical images, and grain insects is described. Combined with the current problems and challenges, future research directions are analyzed.
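    The performance evaluation metrics mentioned above, such as mAP, build on the Intersection-over-Union(IoU) between predicted and ground-truth boxes. A minimal sketch of the standard definition (illustrative, not specific to any surveyed algorithm):

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # overlap width/height clamp to zero when the boxes are disjoint
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# two unit-overlap 2x2 boxes: intersection 1, union 7
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

    A detection typically counts as a true positive only when its IoU with a ground-truth box exceeds a threshold such as 0.5, which is how mAP tables are computed.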

    Related Articles | Metrics
    Survey of Deep Clustering Algorithm Based on Autoencoder
    TAO Wenbin, QIAN Yurong, ZHANG Yiyang, MA Hengzhi, LENG Hongyong, MA Mengnan
    Computer Engineering and Applications    2022, 58 (18): 16-25.   DOI: 10.3778/j.issn.1002-8331.2204-0049
    Abstract914)      PDF(pc) (724KB)(321)       Save
    As a common analysis method, cluster analysis is widely used in various scenarios. With the development of machine learning technology, deep clustering algorithms have become a hot research topic, and autoencoder-based deep clustering is one of the representative approaches. To keep abreast of the development of deep clustering algorithms based on autoencoders, four autoencoder models are introduced, and representative algorithms of recent years are classified according to autoencoder structure. Traditional clustering algorithms and autoencoder-based deep clustering algorithms are compared and analyzed experimentally on the MNIST, USPS, and Fashion-MNIST datasets. Finally, the current problems of deep clustering algorithms based on autoencoders are summarized, and possible research directions of deep clustering algorithms are prospected.
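    A common latent-space clustering step in autoencoder-based deep clustering (DEC-style methods, for instance) is a Student's-t soft assignment of encoded embeddings to cluster centres. A minimal one-dimensional sketch, purely illustrative and not drawn from any specific surveyed algorithm:

```python
def soft_assign(z, centers, alpha=1.0):
    """DEC-style soft assignment: similarity of an embedding z to each cluster
    centre under a Student's-t kernel, normalized to a probability vector."""
    weights = [(1.0 + (z - c) ** 2 / alpha) ** (-(alpha + 1.0) / 2.0)
               for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

# an embedding near the first of two centres is assigned mostly to it
probs = soft_assign(0.1, [0.0, 5.0])
```

    In the full algorithm the assignment is computed on the autoencoder's latent codes and sharpened into a target distribution that the network is then trained to match.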
    Reference | Related Articles | Metrics
    Robot Dynamic Path Planning Based on Improved A* and DWA Algorithm
    LIU Jianjuan, XUE Liqi, ZHANG Huijuan, LIU Zhongpu
    Computer Engineering and Applications    2021, 57 (15): 73-81.   DOI: 10.3778/j.issn.1002-8331.2103-0525
    Abstract907)      PDF(pc) (1452KB)(792)       Save

    The traditional A* algorithm is one of the commonly used algorithms for global path planning of mobile robots, but it has low search efficiency, produces many turning points in the planned path, and cannot achieve dynamic path planning in the face of random dynamic obstacles in complex environments. To solve these problems, the improved A* algorithm and the DWA algorithm are fused on the basis of global optimization. The obstacle information in the environment is quantified, and the weight of the heuristic function of the A* algorithm is adjusted according to this information to improve the efficiency and flexibility of the algorithm. Based on the Floyd algorithm, a path-node optimization algorithm is designed, which deletes redundant nodes, reduces turning points and improves path smoothness. The dynamic window evaluation function of the DWA algorithm is designed on the basis of global optimality and is used to distinguish known obstacles from unknown dynamic and static obstacles, and the key points of the path planned by the improved A* algorithm are extracted as temporary target points for the DWA algorithm. On this basis, the fusion of the improved A* algorithm and the DWA algorithm is realized. The experimental results show that, in complex environments, the fused algorithm not only guarantees globally optimal path planning, but also effectively avoids dynamic and static obstacles, realizing dynamic path planning in complex environments.
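    The redundant-node deletion step described above, dropping a waypoint whenever the nodes before and after it have a collision-free straight line between them, can be sketched as follows. This is a hypothetical reconstruction under assumed grid-map conventions (cells as integer tuples, obstacles as a set), not the paper's actual implementation:

```python
def line_is_free(p, q, blocked):
    """Walk the Bresenham line from cell p to cell q; return True if no cell
    on the segment lies in the obstacle set."""
    (x0, y0), (x1, y1) = p, q
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        if (x0, y0) in blocked:
            return False
        if (x0, y0) == (x1, y1):
            return True
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy

def prune_path(path, blocked):
    """Greedily remove intermediate waypoints: from each kept node, jump to
    the farthest later node it can see directly."""
    pruned = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        while j > i + 1 and not line_is_free(path[i], path[j], blocked):
            j -= 1
        pruned.append(path[j])
        i = j
    return pruned

# an L-shaped grid path collapses to its endpoints when the diagonal is clear
smoothed = prune_path([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)], blocked=set())
```

    Each surviving waypoint is a candidate temporary target point for the local DWA planner.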

    Related Articles | Metrics
    Survey of Transformer-Based Object Detection Algorithms
    LI Jian, DU Jianqiang, ZHU Yanchen, GUO Yongkun
    Computer Engineering and Applications    2023, 59 (10): 48-64.   DOI: 10.3778/j.issn.1002-8331.2211-0133
    Abstract864)      PDF(pc) (875KB)(487)       Save
    Transformer is a deep learning framework with strong modeling and parallel computing capabilities, and object detection algorithms based on Transformer have become a research hotspot. To further explore new ideas and directions, this paper summarizes existing Transformer-based object detection algorithms as well as a variety of object detection data sets and their application scenarios. It describes Transformer-based object detection algorithms from four aspects, i.e. feature extraction, object estimation, label matching policy and application of algorithms; compares the Transformer algorithms with object detection algorithms based on convolutional neural networks; analyzes the advantages and disadvantages of Transformer in object detection tasks; and proposes a general framework for Transformer-based object detection models. Finally, prospects for the development trend of Transformer in the field of object detection are put forward.
    Reference | Related Articles | Metrics
    Application of Deep Reinforcement Learning Algorithm on Intelligent Military Decision System
    KUANG Liqun, LI Siyuan, FENG Li, HAN Xie, XU Qingyu
    Computer Engineering and Applications    2021, 57 (20): 271-278.   DOI: 10.3778/j.issn.1002-8331.2104-0114
    Abstract834)      PDF(pc) (1223KB)(531)       Save

    Deep reinforcement learning algorithms can achieve discrete decision-making behavior well, but they are difficult to apply to the highly complex and continuous situations of the modern battlefield, and such algorithms struggle to converge in multi-agent environments. To solve these problems, an improved Deep Deterministic Policy Gradient(DDPG) algorithm is proposed, which introduces priority-based experience replay and a single training mode to improve the convergence speed of the algorithm; at the same time, an exploration strategy with mixed double noise is designed to realize complex, continuous military decision-making and control behavior. An intelligent military decision simulation platform based on the improved DDPG algorithm is developed with Unity3D, and a simulation environment of Blue Army infantry attacking a Red Army military base is built for multi-agent combat training. The experimental results show that the algorithm can drive multiple combat agents to complete tactical maneuvers and achieve tactical behaviors such as bypassing obstacles to reach a dominant shooting area. The algorithm converges faster, is more stable, obtains higher episode rewards, and thus improves the efficiency of intelligent military decision-making.
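    The "mixed double noise" exploration strategy is not specified in detail in this abstract. One plausible reading, combining temporally correlated Ornstein-Uhlenbeck noise with independent Gaussian noise on a bounded continuous action, can be sketched as follows (hypothetical parameter values, for illustration only):

```python
import random

class OUNoise:
    """Ornstein-Uhlenbeck process: mean-reverting, temporally correlated noise
    often used for smooth exploration in continuous-control DDPG."""
    def __init__(self, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.x = 0.0
        self.rng = random.Random(seed)

    def sample(self):
        self.x += (self.theta * (0.0 - self.x) * self.dt
                   + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0))
        return self.x

def explore(action, ou, gauss_sigma=0.05, low=-1.0, high=1.0):
    """Perturb a deterministic policy action with OU plus Gaussian noise,
    then clip back into the valid action range."""
    noisy = action + ou.sample() + random.gauss(0.0, gauss_sigma)
    return max(low, min(high, noisy))
```

    The correlated component encourages consistent maneuvers over several steps, while the Gaussian component keeps some step-to-step randomness; both are typically annealed as training progresses.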

    Reference | Related Articles | Metrics
    Survey of Interpretability Research on Deep Learning Models
    ZENG Chunyan, YAN Kang, WANG Zhifeng, YU Yan, JI Chunmei
    Computer Engineering and Applications    2021, 57 (8): 1-9.   DOI: 10.3778/j.issn.1002-8331.2012-0357
    Abstract831)      PDF(pc) (677KB)(1208)       Save

    With its data-driven learning, deep learning technology has made great achievements in the fields of natural language processing, image processing, and speech recognition. However, because deep learning models feature deep networks, many parameters and high complexity, the decisions they make and their intermediate processes are difficult for humans to understand. Exploring the interpretability of deep learning has therefore become a new topic in the current artificial intelligence field. This review takes the interpretability of deep learning models as its research object and summarizes progress in the area. Firstly, the main interpretability methods are summarized and analyzed from four aspects: self-explanatory models, model-specific explanation, model-agnostic explanation, and causal interpretability. It then enumerates applications of interpretability-related technologies, and finally discusses the existing problems of current interpretability research to promote the further development of the deep learning interpretability research framework.

    Related Articles | Metrics
    Review of Path Planning Algorithms for Mobile Robots
    LIN Hanxi, XIANG Dan, OUYANG Jian, LAN Xiaodong
    Computer Engineering and Applications    2021, 57 (18): 38-48.   DOI: 10.3778/j.issn.1002-8331.2103-0519
    Abstract830)      PDF(pc) (865KB)(448)       Save

    Path planning is one of the hot research topics in mobile robotics and a key technology for realizing autonomous robot navigation. In this paper, path planning algorithms for mobile robots are studied to understand their development and application in different environments, and the research status and development of path planning are systematically summarized. According to the characteristics of mobile robot path planning, the algorithms are divided into intelligent search algorithms, artificial intelligence-based algorithms, geometric model-based algorithms and local obstacle avoidance algorithms. Based on this classification, the paper introduces representative research results of recent years, analyzes the advantages and disadvantages of the various planning algorithms, and forecasts future development trends of mobile robot path planning, providing ideas for research on robot path planning.

    Related Articles | Metrics
    COVID-19 Medical Imaging Dataset and Research Progress
    LIU Rui, DING Hui, SHANG Yuanyuan, SHAO Zhuhong, LIU Tie
    Computer Engineering and Applications    2021, 57 (22): 15-27.   DOI: 10.3778/j.issn.1002-8331.2106-0118
    Abstract825)      PDF(pc) (1013KB)(356)       Save

    As imaging technology plays an important role in the diagnosis and evaluation of the novel coronavirus(COVID-19), COVID-19-related datasets have been successively published, but few review articles discuss COVID-19 image processing, especially its datasets. To this end, COVID-19 datasets and deep learning models are sorted and analyzed through COVID-19-related journal papers, reports, and related open-source dataset websites, covering Computer Tomography(CT) image and chest X-ray(CXR) image datasets, and the characteristics of the medical images presented by these datasets are analyzed. This paper focuses on collating and describing open-source datasets related to COVID-19 medical imaging. In addition, some important segmentation and classification models that perform well on the related datasets are analyzed and compared. Finally, the paper discusses future development trends in lung imaging technology.

    Reference | Related Articles | Metrics
    Review of Visual Odometry Methods Based on Deep Learning
    ZHI Henghui, YIN Chenyang, LI Huibin
    Computer Engineering and Applications    2022, 58 (20): 1-15.   DOI: 10.3778/j.issn.1002-8331.2203-0480
    Abstract816)      PDF(pc) (904KB)(438)       Save
    Visual odometry(VO) is a common method for the positioning of mobile devices equipped with vision sensors, and has been widely used in autonomous driving, mobile robots, AR/VR and other fields. Compared with traditional model-based methods, deep learning-based methods can learn efficient and robust feature representations from data without explicit computation, improving their ability to handle challenging scenes such as illumination changes and weak texture. This paper first briefly reviews model-based visual odometry methods, and then focuses on six aspects of deep learning-based visual odometry: supervised learning methods, unsupervised learning methods, model-learning fusion methods, common datasets, evaluation metrics, and comparisons between model-based and deep learning methods. Finally, existing problems and future development trends of deep learning-based visual odometry are discussed.
    Reference | Related Articles | Metrics
    Quick Semantic Segmentation Network Based on U-Net and MobileNet-V2
    LAN Tianxiang, XIANG Ziyu, LIU Mingguo, CHEN Kai
    Computer Engineering and Applications    2021, 57 (17): 175-180.   DOI: 10.3778/j.issn.1002-8331.2005-0278
    Abstract811)      PDF(pc) (1156KB)(253)       Save

    The U-Net model is large and accordingly relatively slow at image processing, which makes it difficult to satisfy the requirements of industrial real-time applications. To address this problem, this paper designs a lightweight fully convolutional neural network named LU-Net, which integrates the ideas of MobileNet-V2 into the U-Net framework. The depthwise separable convolution of MobileNet-V2 reduces the parameters and the computational complexity of the network, while the advantages of normal convolution and the bottleneck model are retained, so high-level features are used to maintain accuracy and reduce processing time simultaneously. Experiments on a hollow-symbol dataset and the DRIVE dataset indicate that, compared with U-Net, the proposed LU-Net has 0.59×10^6 parameters, which is 1.9% of the original model, and its processing speed is 5 times faster. Under the experimental environment, LU-Net takes only 25 ms to process a picture at a resolution of 360×270. LU-Net is a promising method for industrial real-time image processing applications.
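    The parameter savings from replacing a standard convolution with a depthwise separable one can be estimated with a back-of-the-envelope calculation (bias terms omitted; the channel sizes below are illustrative, not LU-Net's actual configuration):

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k*k convolution: one k*k*c_in filter per output channel."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise stage (one k*k filter per input channel) plus a pointwise
    1*1 convolution projecting to c_out channels."""
    return c_in * k * k + c_in * c_out

# 64 -> 128 channels with a 3*3 kernel
std = standard_conv_params(64, 128, 3)      # full convolution
sep = depthwise_separable_params(64, 128, 3)  # depthwise + pointwise
```

    The ratio sep/std equals 1/k^2 + 1/c_out, so with a 3×3 kernel and wide layers the separable form needs roughly an order of magnitude fewer weights, which is the main source of LU-Net-style model shrinkage.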

    Related Articles | Metrics
    Review on Semantic Segmentation of UAV Aerial Images
    CHENG Qing, FAN Man, LI Yandong, ZHAO Yuan, LI Chenglong
    Computer Engineering and Applications    2021, 57 (19): 57-69.   DOI: 10.3778/j.issn.1002-8331.2105-0423
    Abstract806)      PDF(pc) (926KB)(597)       Save

    With the rapid development of Unmanned Aerial Vehicle(UAV) technology, research institutions and industry have attached great importance to UAV applications. Optical images and videos are vital for UAVs to sense the environment and occupy an important position in UAV vision. Semantic segmentation, a hot spot of current computer vision research, is widely investigated in the fields of unmanned driving and intelligent robots; semantic segmentation of UAV aerial images applies this technology to aerial imagery to enable UAVs to work in complex scenes. First, a brief introduction to semantic segmentation technology and the application development of UAVs is given, and the relevant UAV aerial datasets, the characteristics of aerial images and commonly used evaluation metrics for semantic segmentation are introduced. Secondly, according to the characteristics of UAV aerial images, relevant semantic segmentation methods are introduced, with analysis and comparison in three aspects: small object detection, real-time performance of the models and multi-scale information integration. Finally, related applications of semantic segmentation for UAVs are reviewed, including line detection, agricultural applications and building extraction, and the development trends and challenges of the future are analyzed.

    Reference | Related Articles | Metrics
    Survey on Image Semantic Segmentation in Dilemma of Few-Shot
    WEI Ting, LI Xinlei, LIU Hui
    Computer Engineering and Applications    2023, 59 (2): 1-11.   DOI: 10.3778/j.issn.1002-8331.2205-0496
    Abstract801)      PDF(pc) (4301KB)(588)       Save
    In recent years, image semantic segmentation has developed rapidly due to the emergence of large-scale datasets. However, in practical applications it is not easy to obtain large-scale, high-quality images, and image annotation consumes a lot of manpower and time. To get rid of the dependence on sample quantity, few-shot semantic segmentation has gradually become a research hotspot. Current few-shot semantic segmentation methods mainly use the idea of meta-learning and, according to model structure, can be divided into three categories: methods based on siamese neural networks, on prototype networks, and on attention mechanisms. Based on current research, this paper introduces the development, advantages and disadvantages of the various methods for few-shot semantic segmentation, as well as common datasets and experimental designs. On this basis, application scenarios and future development directions are summarized.
    Reference | Related Articles | Metrics
    Overview of Image Quality Assessment Method Based on Deep Learning
    CAO Yudong, LIU Haiyan, JIA Xu, LI Xiaohui
    Computer Engineering and Applications    2021, 57 (23): 27-36.   DOI: 10.3778/j.issn.1002-8331.2106-0228
    Abstract766)      PDF(pc) (646KB)(838)       Save

    Image quality evaluation measures the visual quality of an image or video. Research on image quality evaluation algorithms over the past 10 years is reviewed. First, the measurement indicators of image quality evaluation algorithms and image quality evaluation datasets are introduced. Then, different classifications of image quality evaluation methods are analyzed, focusing on algorithms based on deep learning, whose basic models are deep convolutional networks, deep generative adversarial networks and Transformers; the performance of deep learning algorithms is often higher than that of traditional image quality assessment algorithms. Subsequently, the principle of image quality assessment with deep learning is described in detail, and a specific no-reference image quality evaluation algorithm based on a deep generative adversarial network is introduced, which improves the reliability of simulated reference images through enhanced adversarial learning. As deep learning technology requires massive data support, data augmentation methods for improving model performance are elaborated. Finally, future research trends in digital image quality evaluation are summarized.

    Reference | Related Articles | Metrics
    Review of Research on Generative Adversarial Networks and Its Application
    WEI Fuqiang, Gulanbaier Tuerhong, Mairidan Wushouer
    Computer Engineering and Applications    2021, 57 (19): 18-31.   DOI: 10.3778/j.issn.1002-8331.2104-0248
    Abstract764)      PDF(pc) (1078KB)(1152)       Save

    Theoretical research on generative adversarial networks and their applications have achieved continuous success and become one of the current research hot spots in the field of deep learning. This paper provides a systematic review of the theory of generative adversarial networks and their applications in terms of model types, evaluation criteria and theoretical research progress. It analyzes the strengths and weaknesses of generative models based on explicit and implicit density, respectively; summarizes the evaluation criteria of generative adversarial networks and interprets the relationships between the criteria; introduces the research progress of generative adversarial networks in image generation at the application level, covering image translation, image generation, image restoration, video generation, text generation and image super-resolution; and analyzes theoretical research progress from the perspectives of interpretability, controllability, stability and model evaluation methods. Finally, the paper discusses the challenges of studying generative adversarial networks and looks forward to possible future directions of development.

    Reference | Related Articles | Metrics
    Research Progress of Medical Image Registration Technology Based on Deep Learning
    GUO Yanfen, CUI Zhe, YANG Zhipeng, PENG Jing, HU Jinrong
    Computer Engineering and Applications    2021, 57 (15): 1-8.   DOI: 10.3778/j.issn.1002-8331.2101-0281
    Abstract762)      PDF(pc) (681KB)(727)       Save

    Medical image registration technology has wide application value in lesion detection, clinical diagnosis, surgical planning, and efficacy evaluation. This paper systematically summarizes registration algorithms based on deep learning and analyzes the advantages and limitations of various methods, from deep iterative, fully supervised and weakly supervised to unsupervised learning. In general, unsupervised learning has become the mainstream direction of medical image registration research, because it does not rely on gold standards and its end-to-end network saves time. Meanwhile, compared with other methods, unsupervised learning achieves higher accuracy in less time. However, medical image registration methods based on unsupervised learning still face research difficulties and challenges in terms of interpretability, cross-modal diversity, and repeatable scalability, which point out the direction for more accurate medical image registration methods in the future.

    Related Articles | Metrics
    Review of Research on Small Target Detection Based on Deep Learning
    ZHANG Yan, ZHANG Minglu, LYU Xiaoling, GUO Ce, JIANG Zhihong
    Computer Engineering and Applications    2022, 58 (15): 1-17.   DOI: 10.3778/j.issn.1002-8331.2112-0176
    Abstract755)      PDF(pc) (995KB)(421)       Save
    The task of target detection is to quickly and accurately identify and locate predefined categories of objects in an image. With the development of deep learning techniques, detection algorithms have achieved good results for large and medium targets in industry. Due to the characteristics of small targets in images, such as small size, incomplete features and little distinction from the background, the performance of deep learning-based small target detection algorithms still needs further improvement and optimization. Small target detection is in wide demand in many fields such as autonomous driving, medical diagnosis and UAV navigation, so the research has high application value. Based on extensive literature research, this paper first defines small target detection and identifies the current difficulties in the task. It analyzes the current research status from six research directions based on these difficulties and summarizes the advantages and disadvantages of each algorithm. Combining the literature and the state of development, it makes reasonable predictions and outlooks on future research directions in this field, providing a basic reference for subsequent research.
    Reference | Related Articles | Metrics
    Review of Deep Reinforcement Learning Model Research on Vehicle Routing Problems
    YANG Xiaoxiao, KE Lin, CHEN Zhibin
    Computer Engineering and Applications    2023, 59 (5): 1-13.   DOI: 10.3778/j.issn.1002-8331.2210-0153
    Abstract737)      PDF(pc) (1036KB)(426)       Save
    The vehicle routing problem(VRP) is a classic NP-hard problem widely encountered in transportation, logistics and other fields. As problem scale and dynamic factors increase, traditional methods of solving the VRP are challenged in computational speed and intelligence. In recent years, with the rapid development of artificial intelligence technology, and in particular the successful application of reinforcement learning in AlphaGo, new ideas for solving routing problems have emerged. In view of this, this paper summarizes the recent literature on using deep reinforcement learning(DRL) to solve the VRP and its variants. First, it reviews the relevant principles of DRL for the VRP and sorts out the key steps of DRL-based VRP solving. It then systematically classifies and summarizes four types of solving methods, namely pointer networks, graph neural networks, Transformers and hybrid models, and compares and analyzes the performance of current DRL-based models on the VRP and its variants. Finally, the paper sums up the challenges of DRL-based VRP solving and future research directions.
    Reference | Related Articles | Metrics