Computer Engineering and Applications

Progress on Deep Reinforcement Learning in Path Planning

ZHANG Rongxia, WU Changxu, SUN Tongchao, ZHAO Zengshun

Computer Engineering and Applications 2021, 57 (19): 44-56. DOI: 10.3778/j.issn.1002-8331.2104-0369

The purpose of path planning is to allow a robot to avoid obstacles and quickly plan the shortest path while moving. After analyzing the advantages and disadvantages of reinforcement-learning-based path planning algorithms, the paper introduces the Deep Q-learning Network (DQN), a typical deep reinforcement learning algorithm that performs well at path planning in complex dynamic environments. First, the basic principles and limitations of the DQN algorithm are analyzed in depth, and the advantages and disadvantages of various DQN variants are compared from four aspects: the training algorithm, the neural network structure, the learning mechanism, and the AC (Actor-Critic) framework. The paper then sets out the current challenges and open problems in path planning based on deep reinforcement learning. Finally, future development directions are proposed, which can provide a reference for the development of intelligent path planning and autonomous driving.

Improved Lightweight Attention Model Based on CBAM

FU Guodong, HUANG Jin, YANG Tao, ZHENG Siyu

Computer Engineering and Applications 2021, 57 (20): 150-156. DOI: 10.3778/j.issn.1002-8331.2101-0369

In recent years, attention models have been widely used in computer vision. Adding an attention module to a convolutional neural network can significantly improve the network's performance. However, most existing methods focus on developing more complex attention modules to give the convolutional neural network stronger feature expression capabilities, which inevitably increases model complexity. To balance performance and complexity, a lightweight EAM (Efficient Attention Module) model is proposed to optimize the CBAM model. In the channel attention module of CBAM, a one-dimensional convolution replaces the fully connected layer to aggregate the channels. In the spatial attention module of CBAM, the large convolution kernel is replaced with a dilated convolution, which enlarges the receptive field so that broader spatial context information is aggregated. After the model is integrated into YOLOv4 and tested on the VOC2012 data set, mAP increases by 3.48 percentage points. Experimental results show that the attention model introduces only a small number of parameters while greatly improving network performance.
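The channel-attention idea described in this abstract (a one-dimensional convolution across channels in place of a fully connected layer) can be illustrated with a minimal pure-Python sketch. It is not the authors' EAM implementation: the function name `channel_attention_1d`, the zero padding, and the kernel values are assumptions chosen for illustration only.

```python
import math


def channel_attention_1d(feature_map, kernel):
    """Sketch of channel attention that mixes channels with a 1D convolution.

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights (hypothetical values).
    Returns (per-channel weights, rescaled feature map).
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # 1D convolution over neighbouring channels (zero padding assumed here).
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(pooled))
    ]
    # Sigmoid gate turns each mixed descriptor into a weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # Excite: rescale every channel by its attention weight.
    scaled = [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return weights, scaled
```

With a kernel of length k, the channel-mixing step needs only k weights, whereas a fully connected layer over C channels needs on the order of C squared, which is the source of the parameter saving that the abstract refers to.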

Review of Development and Application of Artificial Neural Network Models

ZHANG Chi, GUO Yuan, LI Ming

Computer Engineering and Applications 2021, 57 (11): 57-69. DOI: 10.3778/j.issn.1002-8331.2102-0256

Artificial neural networks are increasingly closely related to other subject areas, and researchers solve problems in various fields by exploring and improving the layer structure of these networks. Based on an analysis of the related literature, this paper summarizes the history of artificial neural networks and presents their underlying principles, including the multilayer perceptron, the back-propagation algorithm, the convolutional neural network, and the recurrent neural network. It explains the classic models in the development of convolutional neural networks and the widely used variant structures of recurrent neural networks, reviews the application of each artificial neural network algorithm in related fields, and summarizes possible directions for the future development of artificial neural networks.

Survey of Multimodal Data Fusion

REN Zeyu, WANG Zhenchao, KE Zunwang, LI Zhe, Wushour·Silamu

Computer Engineering and Applications 2021, 57 (18): 49-64. DOI: 10.3778/j.issn.1002-8331.2104-0237

With the rapid development of information technology, information exists in various forms and comes from various sources. Each form or source of information can be referred to as a modality, and data composed of two or more modalities is called multimodal data. Multimodal data fusion is responsible for effectively integrating the information of multiple modalities and drawing on the advantages of each. Natural phenomena have very rich characteristics, and it is difficult for a single modality to provide complete information about a given phenomenon. Fusion must preserve the diversity and completeness of the modal information, maximize the advantages of each modality, and reduce the information loss caused by the fusion process, so how to integrate the information of each modality has become a new challenge in many fields. This paper briefly describes common multimodal fusion methods and fusion architectures, summarizes three common fusion models, and briefly analyzes the advantages and disadvantages of the three architectures (collaborative, joint, and encoder-decoder), as well as specific fusion methods such as multiple kernel learning and graphical models. For applications, it analyzes and summarizes multimodal video clip retrieval, content summarization from combined multimodal information, multimodal sentiment analysis, and multimodal human-machine dialogue systems. The paper also sets out the current problems of multimodal fusion and future research directions.

Review of Attention Mechanism in Convolutional Neural Networks

ZHANG Chenjia, ZHU Lei, YU Lu

Computer Engineering and Applications 2021, 57 (20): 64-72. DOI: 10.3778/j.issn.1002-8331.2105-0135

Attention mechanisms are widely used in deep learning tasks because of their strong results and plug-and-play convenience. This paper focuses on convolutional neural networks: it introduces the mainstream methods in the development of attention mechanisms for convolutional networks, extracts and summarizes their core ideas and implementation processes, implements each attention method, and carries out comparative experiments and result analysis on measured data from emitter equipment of the same type. Based on the main ideas and experimental results, the research status and future development directions of attention mechanisms in convolutional networks are summarized.

Review of Text Sentiment Analysis Methods

WANG Ting, YANG Wenzhong

Computer Engineering and Applications 2021, 57 (12): 11-24. DOI: 10.3778/j.issn.1002-8331.2101-0022

Text sentiment analysis is an important branch of natural language processing that is widely used in public opinion analysis and content recommendation, and it has been a hot topic in recent years. According to the methods used, it is divided into sentiment analysis based on sentiment dictionaries, sentiment analysis based on traditional machine learning, and sentiment analysis based on deep learning. By comparing these three approaches, the paper analyzes existing research results, summarizes the advantages and disadvantages of the different methods, introduces the related data sets, evaluation metrics, and application scenarios, and briefly summarizes the subtasks of sentiment analysis. Future research trends and application fields of sentiment analysis are identified, providing help and guidance for researchers in related areas.

Multi-channel Attention Mechanism Text Classification Model Based on CNN and LSTM

TENG Jinbao, KONG Weiwei, TIAN Qiaoxin, WANG Zhaoqian, LI Long

Computer Engineering and Applications 2021, 57 (23): 154-162. DOI: 10.3778/j.issn.1002-8331.2104-0212

To address the problem that the traditional Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) cannot reflect the importance of each word in the text when extracting features, this paper proposes a multi-channel text classification model based on CNN and LSTM. First, CNN and LSTM are used to extract the local information and context features of the text. Second, a multi-channel attention mechanism computes attention scores for the outputs of CNN and LSTM. Finally, the outputs of the multi-channel attention mechanism are fused to extract text features effectively and focus attention on important words. Experimental results on three public datasets show that the proposed model outperforms CNN, LSTM, and their improved variants, and effectively improves text classification.

Research Progress on Vision System and Manipulator of Fruit Picking Robot

GOU Yuanmin, YAN Jianwei, ZHANG Fugui, SUN Chengyu, XU Yong

Computer Engineering and Applications 2023, 59 (9): 13-26. DOI: 10.3778/j.issn.1002-8331.2209-0183

Fruit-picking robots are of great significance for making fruit-harvesting equipment automated and intelligent. This paper summarizes recent domestic and international research on the key technologies of fruit-picking robots. First, it discusses the key technologies of the robot vision system, including traditional image segmentation methods based on fruit features, such as the threshold method, the edge detection method, clustering algorithms based on color features, and region-based image segmentation algorithms. It then analyzes and compares object recognition algorithms based on deep learning and methods for locating the target fruit, and summarizes the state of the art in fruit-picking manipulators and end-effectors. Finally, the future development trends and directions of fruit-picking robots are discussed, which can provide a reference for related research.

Research on Object Detection Algorithm Based on Improved YOLOv5

QIU Tianheng, WANG Ling, WANG Peng, BAI Yan’e

Computer Engineering and Applications 2022, 58 (13): 63-73. DOI: 10.3778/j.issn.1002-8331.2202-0093

YOLOv5 is currently one of the better-performing single-stage object detection algorithms, but its bounding-box regression is not very accurate, so it is difficult to apply to scenarios that demand a high intersection over union (IoU) for the predicted boxes. Based on YOLOv5, this paper proposes a new model, YOLO-G, with low hardware requirements, fast convergence, and high bounding-box accuracy. First, the feature pyramid network (FPN) is improved: more features are integrated through cross-level connections, which to a certain extent prevents the loss of shallow semantic information. At the same time, the pyramid is deepened and a detection layer is added accordingly, so that the spacing of the anchor boxes is more reasonable. Second, a parallel attention mechanism is integrated into the network structure. It gives the spatial and channel attention modules the same priority and then extracts attention information by weighted fusion, so that the network can fuse mixed-domain attention according to the degree of spatial and channel attention. Finally, to prevent the loss of real-time performance caused by increased model complexity, the network is made lightweight to reduce its number of parameters and amount of computation. The PASCAL VOC 2007 and 2012 datasets are used to verify the effectiveness of the algorithm. Compared with YOLOv5s, YOLO-G reduces the number of parameters by 4.7% and the amount of computation by 47.9%, while mAP@0.5 and mAP@0.5:0.95 increase by 3.1 and 5.6 percentage points respectively.
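The box-overlap measure that this abstract relies on (the "intersection ratio", usually called intersection over union or IoU) can be sketched as follows. This is a generic illustration, not code from the paper; the function name `iou` and the corner-coordinate box format are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this corner format is an
    assumption made for the sketch.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Under the usual convention, mAP@0.5 counts a prediction as correct when its IoU with a ground-truth box is at least 0.5, and mAP@0.5:0.95 averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05, so the second metric rewards tighter boxes.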

Research Progress of Transformer Based on Computer Vision

LIU Wenting, LU Xinming

Computer Engineering and Applications 2022, 58 (6): 1-16. DOI: 10.3778/j.issn.1002-8331.2106-0442

Transformer is a deep neural network based on the self-attention mechanism that processes data in parallel. In recent years, Transformer-based models have emerged as an important area of research for computer vision tasks. To fill the current gap in domestic review articles on Transformer, this paper covers its application in computer vision. It reviews the basic principles of the Transformer model, focuses on its application to seven visual tasks such as image classification, object detection, and segmentation, and analyzes Transformer-based models that have achieved significant results. Finally, the paper summarizes the challenges and future development trends of the Transformer model in computer vision.

Overview of Multi-Agent Path Finding

LIU Zhifei, CAO Lei, LAI Jun, CHEN Xiliang, CHEN Ying

Computer Engineering and Applications 2022, 58 (20): 43-64. DOI: 10.3778/j.issn.1002-8331.2203-0467

The multi-agent path finding (MAPF) problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents must be able to follow these paths concurrently without colliding with each other. MAPF is widely used in logistics, military, security, and other fields. The main domestic and international research results on MAPF are systematically sorted and classified by planning method into centralized planning algorithms and distributed execution algorithms. Centralized planning algorithms are both the most classical and the most commonly used MAPF algorithms. They are mainly divided into four types, based on A* search, conflict search, cost growth tree, and protocol. Distributed execution algorithms, the other part of MAPF, are based on reinforcement learning. According to the improvement technique used, they can be divided into three types: expert demonstration, improved communication, and task decomposition. By comparing the characteristics and applicability of MAPF algorithms and analyzing their advantages and disadvantages, the paper points out the challenges facing existing algorithms and outlines future work based on the above classification.
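The collision-free constraint stated at the start of this abstract can be made concrete with a small checker. This is a hedged sketch, not an algorithm from the survey: the function name `find_conflicts`, the time-indexed path format, and the assumption that an agent waits at its final cell are all illustrative choices.

```python
def find_conflicts(paths):
    """List conflicts between time-indexed paths.

    paths: dict mapping an agent name to a list of cells, one per time step.
    Assumption for this sketch: an agent stays at its last cell after
    finishing its path.
    Returns tuples (kind, agent_a, agent_b, time_step).
    """
    horizon = max(len(p) for p in paths.values())

    def at(path, t):
        # Position at time t, holding the final cell once the path ends.
        return path[min(t, len(path) - 1)]

    agents = sorted(paths)
    conflicts = []
    for t in range(horizon):
        for i, a in enumerate(agents):
            for b in agents[i + 1:]:
                pa, pb = paths[a], paths[b]
                if at(pa, t) == at(pb, t):
                    # Vertex conflict: both agents occupy the same cell.
                    conflicts.append(("vertex", a, b, t))
                elif (t + 1 < horizon
                      and at(pa, t) == at(pb, t + 1)
                      and at(pa, t + 1) == at(pb, t)):
                    # Edge conflict: the agents swap cells between t and t+1.
                    conflicts.append(("edge", a, b, t))
    return conflicts
```

A set of paths is a valid MAPF solution under these two conflict types when the returned list is empty. Conflict-based search methods repeatedly run a check of this kind and add constraints for the first conflict found.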

YOLOv5 Helmet Wear Detection Method with Introduction of Attention Mechanism

WANG Lingmin, DUAN Jun, XIN Liwei

Computer Engineering and Applications 2022, 58 (9): 303-312. DOI: 10.3778/j.issn.1002-8331.2112-0242

In high-risk industries such as steel manufacturing, coal mining, and construction, wearing a helmet on site is one of the effective ways to avoid injuries. Current helmet-wearing detection models suffer from false detections and missed detections for small and dense targets in complex environments, so an improved YOLOv5 object detection method is proposed to detect helmet wearing. A coordinate attention mechanism is added to the backbone network of YOLOv5; it embeds location information into channel attention so that the network can attend to a larger area. The original feature pyramid module in the feature fusion module is replaced with a weighted bi-directional feature pyramid network (BiFPN) structure to achieve efficient bi-directional cross-scale connections and weighted feature fusion. Experimental results on a self-built helmet dataset show that the improved YOLOv5 model achieves an average accuracy of 95.9%, which is 5.1 percentage points higher than the YOLOv5 model, and meets the requirements for small and dense target detection in complex environments.

Overview on Reinforcement Learning of Multi-agent Game

WANG Jun, CAO Lei, CHEN Xiliang, LAI Jun, ZHANG Legui

Computer Engineering and Applications 2021, 57 (21): 1-13. DOI: 10.3778/j.issn.1002-8331.2104-0432

The use of deep reinforcement learning to solve single-agent tasks has made breakthrough progress. Because of the complexity of multi-agent systems, however, common algorithms cannot resolve the main difficulties. As the number of agents increases, taking the maximization of a single agent's expected cumulative return as the learning goal often fails to converge, and some special convergence points do not satisfy the rationality of the strategy. For practical problems that have no optimal solution, reinforcement learning algorithms are even less effective. Introducing game theory into reinforcement learning can handle the interrelationships among agents well and can explain the rationality of the strategy corresponding to a convergence point. More importantly, the equilibrium solution can replace the optimal solution in order to obtain a relatively effective strategy. This article therefore surveys the reinforcement learning algorithms that have emerged in recent years from the perspective of game theory, summarizes the key points and difficulties of current game reinforcement learning algorithms, and gives several directions that may lead to breakthroughs on these difficulties.

Review on Integration Analysis and Application of Multi-omics Data

ZHONG Yating, LIN Yanmei, CHEN Dingjia, PENG Yuzhong, ZENG Yuanpeng

Computer Engineering and Applications 2021, 57 (23): 1-17. DOI: 10.3778/j.issn.1002-8331.2106-0341

With the continuous emergence and popularization of new omics sequencing technology, a large number of omics data have been produced, which is of great significance for people to further study and reveal the mysteries of life. Using multi-omics data to integrate and analyze life science problems can obtain more abundant and more comprehensive information related to life system, which has become a new direction for scientists to explore the mechanism of life. This paper introduces the research background and significance of multi-omics data integration analysis, summarizes the methods of data integration analysis of multiomics in recent years and the applied research in related fields, and finally discusses the current existing problems and future prospects of multi-omics data integration analysis methods.

Survey of Opponent Modeling Methods and Applications in Intelligent Game Confrontation

WEI Tingting, YUAN Weilin, LUO Junren, ZHANG Wanpeng

Computer Engineering and Applications 2022, 58 (9): 19-29. DOI: 10.3778/j.issn.1002-8331.2202-0297

Intelligent game confrontation has always been a focus of artificial intelligence research. In a game confrontation environment, the actions, goals, strategies, and other attributes of an agent can be inferred by opponent modeling, which provides key information for formulating game strategy. Opponent modeling is promising for competitive games and combat simulation, and because formulating a game strategy presupposes knowledge of the action strategies of all parties in the game, it is especially important to establish an accurate model of opponent behavior to predict the opponent's intentions. Along the three dimensions of connotation, method, and application, the necessity of opponent modeling is explained and the existing modeling methods are classified. Prediction methods based on reinforcement learning, reasoning methods based on theory of mind, and optimization methods based on Bayesian approaches are summarized. Taking a sequential game (Texas Hold'em), a real-time strategy game (StarCraft), and meta-games as typical application scenarios, the role of opponent modeling in intelligent game confrontation is analyzed. Finally, the future development of opponent modeling technology is discussed from three aspects: bounded rationality, deception strategy, and interpretability.

Review of Neural Style Transfer Models

TANG Renwei, LIU Qihe, TAN Hao

Computer Engineering and Applications 2021, 57 (19): 32-43. DOI: 10.3778/j.issn.1002-8331.2105-0296

Neural Style Transfer (NST) is a technique for simulating different art styles in images and videos, and it is a popular topic in computer vision. This paper aims to provide a comprehensive overview of current progress in NST. First, the paper reviews the Non-Photorealistic Rendering (NPR) technique and traditional texture transfer. Then, it categorizes the major current NST methods and gives a detailed description of these methods along with their subsequent improvements. After that, it discusses various applications of NST and presents several evaluation methods that compare different style transfer models both qualitatively and quantitatively. Finally, it summarizes the existing problems and provides some future research directions for NST.

Overview of Chinese Domain Named Entity Recognition

JIAO Kainan, LI Xin, ZHU Rongchen

Computer Engineering and Applications 2021, 57 (16): 1-15. DOI: 10.3778/j.issn.1002-8331.2103-0127

Named Entity Recognition (NER), a classic research topic in natural language processing, is a basic technology for intelligent question answering, knowledge graphs, and other tasks. Domain Named Entity Recognition (DNER) is the domain-specific form of NER. Driven by deep learning technology, Chinese DNER has made a breakthrough. First, this paper summarizes the research framework of Chinese DNER and reviews existing research results from four aspects: the determination of domain data sources, the establishment of domain entity types and specifications, the annotation of domain data sets, and the evaluation metrics of Chinese DNER. Then, it summarizes the common technical frameworks of Chinese DNER, introduces pattern matching methods based on dictionaries and rules, statistical machine learning methods, deep learning methods, and multi-party fusion deep learning methods, and focuses on Chinese DNER methods based on word vector representation and deep learning. Finally, typical application scenarios of Chinese DNER are discussed, and future development directions are outlined.

Research on Local Path Planning Algorithm Based on Improved TEB Algorithm

DAI Wanyu, ZHANG Lijuan, WU Jiafeng, MA Xianghua

Computer Engineering and Applications 2022, 58 (8): 283-288. DOI: 10.3778/j.issn.1002-8331.2108-0290

When the traditional TEB (timed elastic band) algorithm is used to plan a path in a complex dynamic environment, non-smooth velocity commands cause path oscillations, which subject the robot to larger shocks and make it prone to collisions. To address these problems, the traditional TEB algorithm is improved. Detected irregular obstacles are inflated and classified by region, and routes through the safe area are given priority so that the robot runs more safely and smoothly in the complex environment. Adding the obstacle distance to the speed constraint in the algorithm effectively reduces the oscillation amplitude and the shocks that the robot experiences along the path when its speed jumps near an obstacle, and so helps ensure the safety of the robot during operation. A large number of comparative simulations in the ROS environment show that, in a complex dynamic environment, the path planned by the improved TEB algorithm is safer and smoother and effectively reduces the shocks to the robot.

Robot Dynamic Path Planning Based on Improved A* and DWA Algorithm

LIU Jianjuan, XUE Liqi, ZHANG Huijuan, LIU Zhongpu

Computer Engineering and Applications 2021, 57 (15): 73-81. DOI: 10.3778/j.issn.1002-8331.2103-0525

The traditional A* algorithm is one of the commonly used algorithms for global path planning of mobile robots, but it has low search efficiency, produces paths with many turning points, and cannot achieve dynamic path planning when faced with random dynamic obstacles in a complex environment. To solve these problems, an improved A* algorithm and the DWA algorithm are integrated on the basis of global optimization. The obstacle information in the environment is quantified, and the weight of the heuristic function of the A* algorithm is adjusted according to this information to improve the efficiency and flexibility of the algorithm. Based on the Floyd algorithm, a path node optimization algorithm is designed that deletes redundant nodes, reduces turning points, and improves path smoothness. The dynamic window evaluation function of the DWA algorithm is designed around the globally optimal path and is used to distinguish known obstacles from unknown dynamic and static obstacles, and the key points of the path planned by the improved A* algorithm are extracted as temporary target points for the DWA algorithm. In this way, the improved A* algorithm and the DWA algorithm are fused on the basis of the global optimum. The experimental results show that, in a complex environment, the fusion algorithm not only ensures globally optimal path planning but also effectively avoids dynamic and static obstacles, achieving dynamic path planning in the complex environment.
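The idea of adjusting the weight of the A* heuristic from quantified obstacle information can be sketched on a small grid. This is an illustration under stated assumptions, not the authors' method: the abstract does not give the weighting rule, so `heuristic_weight` uses a hypothetical linear rule based on obstacle ratio, and the grid format, function names, and 4-connected unit-cost moves are likewise assumptions.

```python
import heapq


def heuristic_weight(grid, w_min=1.0, w_max=2.0):
    """Hypothetical rule: the sparser the obstacles, the larger the weight."""
    cells = sum(len(row) for row in grid)
    obstacles = sum(row.count("#") for row in grid)
    return w_max - (w_max - w_min) * obstacles / cells


def weighted_a_star(grid, start, goal, weight=1.0):
    """A* with f = g + weight * h on a 4-connected grid ('#' is an obstacle).

    Returns the list of cells from start to goal, or None if unreachable.
    With weight == 1.0 this is ordinary A* with a Manhattan heuristic.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(weight * h(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > g_cost[cell]:
            continue  # Stale heap entry; a cheaper route was found already.
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == "#":
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_g + weight * h(nxt), new_g, nxt))
    return None
```

A weight above 1 makes the search greedier and usually expands fewer nodes, at the cost of no longer guaranteeing the shortest path, which is one reason a scheme of this kind would tie the weight to how cluttered the map is.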

Overview of Visual Multi-object Tracking Algorithms with Deep Learning

ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa

Computer Engineering and Applications 2021, 57 (13): 55-66. DOI: 10.3778/j.issn.1002-8331.2102-0260

Visual multi-object tracking is a hot issue in the field of computer vision. However, the uncertain number of targets in a scene, mutual occlusion between targets, and the difficulty of discriminating between target features have led to slow progress in the real-world application of visual multi-object tracking. In recent years, with continued in-depth research on visual intelligent processing, a variety of deep learning algorithms for visual multi-object tracking have emerged. Based on an analysis of the challenges and difficulties faced by visual multi-object tracking, the algorithms are divided into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), with six sub-categories, and their advantages and disadvantages are studied. The analysis shows that DBT algorithms have a simple structure, but the sub-steps of the algorithm are only loosely coupled. JDT algorithms integrate joint learning across multiple modules and lead on multiple tracking evaluation metrics. In DBT algorithms, the feature extraction module is the key to handling target occlusion, at the expense of algorithm speed, while JDT algorithms depend more heavily on the detection module. At present, multi-object tracking is generally moving from DBT-type algorithms toward JDT, achieving a stage-by-stage balance between accuracy and speed. Future development directions for multi-object tracking algorithms in terms of datasets, sub-modules, and specific scenarios are proposed.

Survey of Transformer-Based Object Detection Algorithms

LI Jian, DU Jianqiang, ZHU Yanchen, GUO Yongkun

Computer Engineering and Applications 2023, 59 (10): 48-64. DOI: 10.3778/j.issn.1002-8331.2211-0133

Transformer is a deep learning framework with strong modeling and parallel computing capabilities, and object detection based on Transformer has become a research hotspot. To further explore new ideas and directions, this paper summarizes existing Transformer-based object detection algorithms as well as a variety of object detection data sets and their application scenarios. It describes the related algorithms for Transformer-based object detection from four aspects (feature extraction, object estimation, label matching policy, and algorithm application), compares Transformer algorithms with object detection algorithms based on convolutional neural networks, analyzes the advantages and disadvantages of Transformer in object detection tasks, and proposes a general framework for Transformer-based object detection models. Finally, the development trends of Transformer in the field of object detection are discussed.

Research Progress of YOLO Series Target Detection Algorithms

WANG Linyi, BAI Jing, LI Wenjing, JIANG Jinzhe

Computer Engineering and Applications 2023, 59 (14): 15-29. DOI: 10.3778/j.issn.1002-8331.2301-0081

The YOLO-based algorithm is one of the hot research directions in target detection. In recent years, with the continuous proposition of YOLO series algorithms and their improved models, the YOLO-based algorithm has achieved excellent results in the field of target detection and has been widely used in various fields in reality. This article first introduces the typical datasets and evaluation index for target detection and reviews the overall YOLO framework and the development of the target detection algorithm of YOLOv1~YOLOv7. Then, models and their performance are summarized across eight improvement directions, such as data augmentation, lightweight network construction, and IOU loss optimization, at the three stages of input, feature extraction, and prediction. Afterwards, the application fields of YOLO algorithm are introduced. Finally, combined with the actual problems of target detection, it summarizes and prospects the development direction of the YOLO-based algorithm.

Review of Research on Road Traffic Flow Data Prediction Methods

MENG Chuang, WANG Hui, LIN Hao, LI Kecen, WANG Xinpeng

Computer Engineering and Applications 2023, 59 (14): 51-61. DOI: 10.3778/j.issn.1002-8331.2209-0458

As an important branch of intelligent transportation systems, road traffic flow prediction plays an important role in congestion prediction and path planning. The spatio-temporal polymorphism and complex correlations of road traffic flow data are forcing the transformation and upgrading of road traffic flow prediction methods in the era of big data. To mine the spatio-temporal characteristics of traffic flow and improve prediction accuracy, scholars have proposed various methods, including model fusion, improvements to model algorithms, and conversion of data definitions. To summarize these traffic flow prediction methods in a reasonable way, they are divided into three categories according to the type of method used: statistics-based methods, machine-learning-based methods, and deep-learning-based methods. This paper reviews the various traffic flow prediction methods and analyzes the new models and algorithms of recent years, aiming to provide research ideas for relevant researchers. Finally, the methods of traffic flow prediction are summarized, and directions for future exploration in the field are given.

Survey on Zero-Shot Learning

WANG Zeshen, YANG Yun, XIANG Hongxin, LIU Qing

Computer Engineering and Applications 2021, 57 (19): 1-17. DOI: 10.3778/j.issn.1002-8331.2106-0133

Abstract （992）

PDF（pc）（1267KB）（605）

Although zero-shot learning has developed considerably with the rise of deep learning, its applications have not yet been organized into a coherent system. This paper gives an overview of the theoretical systems of zero-shot learning, typical models, application systems, present challenges, and future research directions. Firstly, it introduces the theoretical systems in terms of the definition of zero-shot learning, its essential problems, and commonly used datasets. Secondly, some typical models of zero-shot learning are described in chronological order. Thirdly, it presents the application systems of zero-shot learning along three dimensions: words, images, and videos. Finally, the paper analyzes the challenges and future research directions in zero-shot learning.

Survey of Deep Clustering Algorithm Based on Autoencoder

TAO Wenbin, QIAN Yurong, ZHANG Yiyang, MA Hengzhi, LENG Hongyong, MA Mengnan

Computer Engineering and Applications 2022, 58 (18): 16-25. DOI: 10.3778/j.issn.1002-8331.2204-0049

Abstract （948）

PDF（pc）（724KB）（347）

As a common analysis method, cluster analysis is widely used in various scenarios. With the development of machine learning technology, deep clustering has also become a hot research topic, and deep clustering algorithms based on autoencoders are among the representative algorithms. To keep abreast of the development of autoencoder-based deep clustering algorithms, four models of autoencoders are introduced, and the representative algorithms of recent years are classified according to the structure of the autoencoder. Traditional clustering algorithms and autoencoder-based deep clustering algorithms are compared and analyzed experimentally on the MNIST, USPS, and Fashion-MNIST datasets. Finally, the current problems of autoencoder-based deep clustering algorithms are summarized, and possible research directions for deep clustering algorithms are discussed.

Review of Path Planning Algorithms for Mobile Robots

LIN Hanxi, XIANG Dan, OUYANG Jian, LAN Xiaodong

Computer Engineering and Applications 2021, 57 (18): 38-48. DOI: 10.3778/j.issn.1002-8331.2103-0519

Abstract （944）

PDF（pc）（865KB）（524）

Path planning is one of the hot research topics in mobile robotics and a key technology for realizing autonomous robot navigation. This paper studies path planning algorithms for mobile robots to understand their development and application in different environments, and systematically summarizes the research status and development of path planning. According to the characteristics of mobile robot path planning, the algorithms are divided into intelligent search algorithms, artificial intelligence-based algorithms, geometric model-based algorithms, and local obstacle avoidance algorithms. Based on this classification, the paper introduces representative research results of recent years, analyzes the advantages and disadvantages of the various planning algorithms, and forecasts the future development trend of mobile robot path planning, providing some ideas for research on robot path planning.

Survey of Transformer Research in Computer Vision

LI Xiang, ZHANG Tao, ZHANG Zhe, WEI Hongyang, QIAN Yurong

Computer Engineering and Applications 2023, 59 (1): 1-14. DOI: 10.3778/j.issn.1002-8331.2204-0207

Abstract （937）

PDF（pc）（1285KB）（592）

Transformer is a deep neural network based on the self-attention mechanism. In recent years, Transformer-based models have become a hot research direction in computer vision, and their structures are constantly being improved and extended, for example with local attention mechanisms and pyramid structures. Improved vision models based on the Transformer structure are reviewed and summarized in terms of performance optimization and structural improvement, respectively. In addition, the advantages and disadvantages of the Transformer and convolutional neural network (CNN) structures are compared and analyzed, and a new hybrid CNN+Transformer structure is introduced. Finally, the development of Transformer in computer vision is summarized and its prospects are discussed.

Review of Research on Small Target Detection Based on Deep Learning

ZHANG Yan, ZHANG Minglu, LYU Xiaoling, GUO Ce, JIANG Zhihong

Computer Engineering and Applications 2022, 58 (15): 1-17. DOI: 10.3778/j.issn.1002-8331.2112-0176

Abstract （906）

PDF（pc）（995KB）（473）

The task of target detection is to quickly and accurately identify and locate objects of predefined categories in an image. With the development of deep learning techniques, detection algorithms have achieved good results for large and medium targets in industry. The performance of deep learning-based small target detection algorithms still needs further improvement because of the characteristics of small targets in images, such as small size, incomplete features, and a large gap between them and the background. Small target detection is in wide demand in fields such as autonomous driving, medical diagnosis, and UAV navigation, so the research has high application value. Based on extensive literature research, this paper first defines small target detection and identifies its current difficulties. It then analyzes the current research status from six research directions based on these difficulties and summarizes the advantages and disadvantages of each algorithm. Finally, combining the literature and the development status, it makes reasonable predictions about future research directions in this field to provide a basic reference for subsequent research.

Overview of Smoke and Fire Detection Algorithms Based on Deep Learning

ZHU Yuhua, SI Yiyi, LI Zhihui

Computer Engineering and Applications 2022, 58 (23): 1-11. DOI: 10.3778/j.issn.1002-8331.2206-0154

Abstract （893）

PDF（pc）（782KB）（472）

Fire is one of the most frequent and widespread disasters threatening public safety and social development. With the rapid development of economic construction and the growing size of cities, the number of major fire hazards has increased dramatically. However, the widely used smoke-sensor approach to fire detection is easily affected by factors such as distance, resulting in delayed detection. The introduction of video surveillance systems has provided new ideas for solving this problem. Traditional video-based image processing algorithms were proposed earlier, and the recent rapid development of machine vision and image processing technologies has produced a series of methods that use deep learning to automatically detect fires in video and images, which have important practical applications in fire safety. To comprehensively analyze the improvements and applications of deep learning methods for fire detection, this paper first briefly introduces the deep learning-based fire detection process, then gives a detailed comparative analysis of deep methods for fire detection at three granularities (classification, detection, and segmentation), and elaborates on the improvements each class of algorithms makes to address existing problems. Finally, the current problems of fire detection are summarized and future research directions are proposed.

Overview of Image Quality Assessment Method Based on Deep Learning

CAO Yudong, LIU Haiyan, JIA Xu, LI Xiaohui

Computer Engineering and Applications 2021, 57 (23): 27-36. DOI: 10.3778/j.issn.1002-8331.2106-0228

Abstract （882）

PDF（pc）（646KB）（906）

Image quality evaluation is a measurement of the visual quality of an image or video. Research on image quality evaluation algorithms over the past 10 years is reviewed. First, the performance metrics of image quality evaluation algorithms and the image quality evaluation datasets are introduced. Then, the different classifications of image quality evaluation methods are analyzed, with a focus on algorithms that use deep learning, whose basic models are deep convolutional networks, deep generative adversarial networks, and transformers. The performance of deep learning algorithms is often higher than that of traditional image quality assessment algorithms. Subsequently, the principle of image quality assessment with deep learning is described in detail. A specific no-reference image quality evaluation algorithm based on a deep generative adversarial network is introduced, which improves the reliability of simulated reference images through enhanced adversarial learning. Because deep learning requires massive data, data augmentation methods for improving model performance are elaborated. Finally, the future research trends of digital image quality evaluation are summarized.

Survey of Camera Pose Estimation Methods Based on Deep Learning

WANG Jing, JIN Yuchu, GUO Ping, HU Shaoyi

Computer Engineering and Applications 2023, 59 (7): 1-14. DOI: 10.3778/j.issn.1002-8331.2209-0280

Abstract （881）

PDF（pc）（702KB）（442）

Camera pose estimation is a technology for accurately estimating the 6-DOF position and orientation of a camera in the world coordinate system in a known environment. It is a key technology in robotics and autonomous driving. With the rapid development of deep learning, using deep learning to optimize camera pose estimation algorithms has become one of the current research hotspots. To capture the current research status and trends of camera pose estimation algorithms, the mainstream deep learning-based algorithms are summarized. Firstly, the traditional camera pose estimation methods based on feature points are briefly introduced. Then, the deep learning-based camera pose estimation methods are introduced in depth. According to their core algorithms, end-to-end camera pose estimation, scene coordinate regression, retrieval-based camera pose estimation, hierarchical structure, multi-information fusion, and cross-scene camera pose estimation are elaborated and analyzed in detail. Finally, this paper summarizes the current research status, points out the challenges in the field based on in-depth performance analysis, and discusses the development trend of camera pose estimation.

Review of Deep Reinforcement Learning Model Research on Vehicle Routing Problems

YANG Xiaoxiao, KE Lin, CHEN Zhibin

Computer Engineering and Applications 2023, 59 (5): 1-13. DOI: 10.3778/j.issn.1002-8331.2210-0153

Abstract （880）

PDF（pc）（1036KB）（489）

The vehicle routing problem (VRP) is a classic NP-hard problem with wide application in transportation, logistics, and other fields. As problem scale and dynamic factors increase, traditional methods for solving the VRP are challenged in computational speed and intelligence. In recent years, the rapid development of artificial intelligence, in particular the successful application of reinforcement learning in AlphaGo, has provided a new idea for solving routing problems. In view of this, this paper summarizes the recent literature that uses deep reinforcement learning (DRL) to solve the VRP and its variants. Firstly, it reviews the relevant principles of using DRL to solve the VRP and sorts out the key steps of DRL-based VRP solving. Then it systematically classifies and summarizes four types of solving methods, based on pointer networks, graph neural networks, Transformers, and hybrid models, and compares and analyzes the performance of current DRL-based models on the VRP and its variants. Finally, it sums up the challenges of DRL-based VRP solving and future research directions.

Review of Visual Odometry Methods Based on Deep Learning

ZHI Henghui, YIN Chenyang, LI Huibin

Computer Engineering and Applications 2022, 58 (20): 1-15. DOI: 10.3778/j.issn.1002-8331.2203-0480

Abstract （866）

PDF（pc）（904KB）（467）

Visual odometry (VO) is a common method for positioning mobile devices equipped with vision sensors, and it has been widely used in autonomous driving, mobile robots, AR/VR, and other fields. Compared with traditional model-based methods, deep learning-based methods can learn efficient and robust feature representations from data without explicit computation, thereby improving their ability to handle challenging scenes such as illumination changes and low texture. This paper first briefly reviews the model-based visual odometry methods, and then focuses on six aspects of deep learning-based visual odometry: supervised learning methods, unsupervised learning methods, model-learning fusion methods, common datasets, evaluation metrics, and a comparison of model-based and deep learning methods. Finally, existing problems and future development trends of deep learning-based visual odometry are discussed.

Application of Deep Reinforcement Learning Algorithm on Intelligent Military Decision System

KUANG Liqun, LI Siyuan, FENG Li, HAN Xie, XU Qingyu

Computer Engineering and Applications 2021, 57 (20): 271-278. DOI: 10.3778/j.issn.1002-8331.2104-0114

Abstract （849）

PDF（pc）（1223KB）（536）

Deep reinforcement learning algorithms handle discrete decision-making well, but they are difficult to apply to the highly complex and continuous situations of the modern battlefield, and they converge with difficulty in multi-agent environments. To solve these problems, an improved Deep Deterministic Policy Gradient (DDPG) algorithm is proposed, which introduces priority-based experience replay and a single training mode to improve the convergence speed; at the same time, a mixed double-noise exploration strategy is designed to realize complex and continuous military decision-making and control behavior. An intelligent military decision simulation platform based on the improved DDPG algorithm is developed in Unity3D. A simulation environment in which Blue Army infantry attack a Red Army military base is built to simulate multi-agent combat training. The experimental results show that the algorithm can drive multiple combat agents to complete tactical maneuvers and achieve tactical behaviors, such as bypassing obstacles to reach a dominant area for shooting. The algorithm converges faster, is more stable, and obtains higher round rewards, thereby improving the efficiency of intelligent military decision-making.

Quick Semantic Segmentation Network Based on U-Net and MobileNet-V2

LAN Tianxiang, XIANG Ziyu, LIU Mingguo, CHEN Kai

Computer Engineering and Applications 2021, 57 (17): 175-180. DOI: 10.3778/j.issn.1002-8331.2005-0278

Abstract （848）

PDF（pc）（1156KB）（270）

The U-Net model is large and therefore relatively slow at image processing, which makes it difficult to meet the requirements of industrial real-time applications. To address this problem, this paper designs a lightweight fully convolutional neural network named LU-Net, which integrates the ideas of MobileNet-V2 into the U-Net framework. The depthwise separable convolution of MobileNet-V2 effectively reduces the parameters and computational complexity of the proposed network. The network also retains the advantages of normal convolution and the bottleneck model, so it can use high-level features to maintain accuracy while reducing processing time. Experiments on the hollow symbol dataset and the DRIVE dataset indicate that, compared with U-Net, LU-Net has 0.59×10^6 parameters, which is 1.9% of the original model, and its processing speed is 5 times faster. In the experimental environment, LU-Net takes only 25 ms to process a picture at a resolution of 360×270. LU-Net is a promising method for industrial real-time picture processing applications.
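
The parameter saving from depthwise separable convolution can be illustrated with a simple count. The sketch below is not taken from the paper: the function names and the layer sizes (64 input channels, 128 output channels, 3×3 kernel) are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

For this hypothetical layer the separable form needs roughly 12% of the weights of the standard convolution. The 1.9% figure reported in the abstract refers to the whole LU-Net model and cannot be derived from this single-layer count.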

COVID-19 Medical Imaging Dataset and Research Progress

LIU Rui, DING Hui, SHANG Yuanyuan, SHAO Zhuhong, LIU Tie

Computer Engineering and Applications 2021, 57 (22): 15-27. DOI: 10.3778/j.issn.1002-8331.2106-0118

Abstract （845）

PDF（pc）（1013KB）（366）

As imaging technology plays an important role in the diagnosis and evaluation of the novel coronavirus disease (COVID-19), COVID-19-related datasets have been published in succession. However, few review articles discuss COVID-19 image processing, especially the datasets. To this end, COVID-19 datasets and deep learning models are sorted and analyzed on the basis of COVID-19-related journal papers, reports, and open-source dataset websites; the datasets include computed tomography (CT) image datasets and chest X-ray (CXR) image datasets. The characteristics of the medical images in these datasets are also analyzed. This paper focuses on collating and describing open-source datasets related to COVID-19 medical imaging. In addition, some important segmentation and classification models that perform well on these datasets are analyzed and compared. Finally, the paper discusses future development trends in lung imaging technology.

Survey on Image Semantic Segmentation in Dilemma of Few-Shot

WEI Ting, LI Xinlei, LIU Hui

Computer Engineering and Applications 2023, 59 (2): 1-11. DOI: 10.3778/j.issn.1002-8331.2205-0496

Abstract （843）

PDF（pc）（4301KB）（619）

In recent years, image semantic segmentation has developed rapidly thanks to the emergence of large-scale datasets. However, in practical applications it is not easy to obtain large-scale, high-quality images, and image annotation also consumes a great deal of manpower and time. To reduce the dependence on the number of samples, few-shot semantic segmentation has gradually become a research hotspot. Current few-shot semantic segmentation methods mainly use the idea of meta-learning and can be divided into three categories according to model structure: methods based on the Siamese neural network, methods based on the prototype network, and methods based on the attention mechanism. Based on current research, this paper introduces the development, advantages, and disadvantages of the various methods for few-shot semantic segmentation, as well as common datasets and experimental designs. On this basis, the application scenarios and future development directions are summarized.

Review on Semantic Segmentation of UAV Aerial Images

CHENG Qing, FAN Man, LI Yandong, ZHAO Yuan, LI Chenglong

Computer Engineering and Applications 2021, 57 (19): 57-69. DOI: 10.3778/j.issn.1002-8331.2105-0423

Abstract （834）

PDF（pc）（926KB）（609）

With the rapid development of Unmanned Aerial Vehicle (UAV) technology, research institutions and industry have attached importance to UAV applications. Optical images and videos are vital for the UAV to sense its environment and occupy an important position in UAV vision. As a current hot spot of computer vision research, semantic segmentation is widely investigated in the fields of unmanned driving and intelligent robots. Semantic segmentation of UAV aerial images enables the UAV to work in complex scenes. First, a brief introduction to semantic segmentation technology and the development of UAV applications is given, and the relevant UAV aerial datasets, the characteristics of aerial images, and commonly used evaluation metrics for semantic segmentation are introduced. Secondly, the relevant semantic segmentation methods are introduced according to the characteristics of UAV aerial images, with analysis and comparison in three aspects: small object detection, real-time performance of the models, and multi-scale information integration. Finally, the related applications of semantic segmentation for UAVs are reviewed, including line detection, agricultural applications, and building extraction, and future development trends and challenges are analyzed.

Overview of Image Edge Detection

XIAO Yang, ZHOU Jun

Computer Engineering and Applications 2023, 59 (5): 40-54. DOI: 10.3778/j.issn.1002-8331.2209-0122

Abstract （826）

PDF（pc）（921KB）（418）

The task of edge detection is to identify pixels with significant brightness changes as target edges. It is a low-level problem in computer vision with important applications in object recognition and detection, object proposal generation, and image segmentation. Edge detection methods now fall into several types: traditional gradient-based detection methods, deep learning-based edge detection algorithms, and detection methods combined with emerging technologies. A finer classification of these methods gives researchers a clearer understanding of the trends in edge detection. Firstly, the theoretical basis and implementation methods of traditional edge detection are introduced. Then the main edge detection methods of recent years are summarized and classified according to the methods used, and their core techniques, such as branching structure, feature fusion, and loss function, are introduced. Algorithm performance is evaluated with the optimal dataset scale (ODS) and frames per second (FPS) metrics and compared on the benchmark dataset BSDS500. Finally, the current state of edge detection research is examined and summarized, and possible future research directions are discussed.
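
As a minimal illustration of the traditional gradient-based approach mentioned above, the following sketch marks a pixel as an edge when its Sobel gradient magnitude exceeds a threshold. The Sobel operator is used here only as a common example of a gradient-based detector; the function name, the threshold, and the toy image are assumptions made for illustration and are not taken from the paper.

```python
def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where the Sobel gradient magnitude exceeds threshold.

    `image` is a list of rows of grayscale values; border pixels are left as 0.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal brightness change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical brightness change
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges


# Toy image (hypothetical): dark left half, bright right half.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edges(image, threshold=20)
for row in edges:
    print(row)
```

In this toy image only the two interior columns next to the brightness step are marked, which matches the definition of an edge as a significant brightness change.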

Review of Research on Driver Fatigue Driving Detection Methods

ZHANG Rui, ZHU Tianjun, ZOU Zhiliang, SONG Rui

Computer Engineering and Applications 2022, 58 (21): 53-66. DOI: 10.3778/j.issn.1002-8331.2204-0053

Abstract （818）

PDF（pc）（946KB）（371）

The proportion of traffic accidents caused by fatigue driving has increased year by year, attracting widespread attention from researchers. At present, research on fatigue driving detection is limited by factors such as the level of science and technology, the environment, and road conditions, which makes it difficult to develop the technology further. This article introduces the progress of the past decade in driver fatigue driving detection methods. The two categories of active and passive detection methods are elaborated and reviewed, and each is further classified according to its characteristics. The advantages and limitations of the various fatigue driving detection methods are analyzed, and the detection algorithms used in facial feature-based active detection over the past three years are analyzed and summarized. Finally, the shortcomings of the various methods are summarized and future research trends in fatigue detection are proposed, providing new ideas for further research.

Most Read articles