Most Downloaded Articles


    In last 2 years
    Review of Research on Generative Adversarial Networks and Its Application
    WEI Fuqiang, Gulanbaier Tuerhong, Mairidan Wushouer
    Computer Engineering and Applications    2021, 57 (19): 18-31.   DOI: 10.3778/j.issn.1002-8331.2104-0248
    Abstract (516)      PDF(pc) (1078KB) (1081)

    Theoretical research on generative adversarial networks (GANs) and their applications has seen sustained success, and GANs have become one of the current research hot spots in deep learning. This paper provides a systematic review of GAN theory and applications in terms of model types, evaluation criteria, and theoretical research progress. It analyzes the strengths and weaknesses of generative models based on explicit and implicit density, respectively; summarizes the evaluation criteria for GANs and interprets the relationships among them; introduces research progress at the application level, namely in image conversion, image generation, image restoration, video generation, text generation, and image super-resolution; and analyzes theoretical research progress from the perspectives of interpretability, controllability, stability, and model evaluation methods. Finally, the paper discusses the challenges of studying GANs and looks ahead to possible future directions of development.

    Review of Development and Application of Artificial Neural Network Models
    ZHANG Chi, GUO Yuan, LI Ming
    Computer Engineering and Applications    2021, 57 (11): 57-69.   DOI: 10.3778/j.issn.1002-8331.2102-0256
    Abstract (859)      PDF(pc) (781KB) (1023)

    Artificial neural networks are increasingly closely connected with other subject areas, and researchers solve problems in various fields by exploring and improving the layer structure of these networks. Based on an analysis of the related literature, this paper summarizes the history of artificial neural networks and presents their underlying principles along the line of their development, including the multilayer perceptron, the back-propagation algorithm, the convolutional neural network, and the recurrent neural network. It explains the classic models in the development of convolutional neural networks and the widely used variant structures of recurrent neural networks, reviews the application of each artificial neural network algorithm in related fields, and summarizes possible directions for future development.

    Survey of Network Traffic Forecast Based on Deep Learning
    KANG Mengxuan, SONG Junping, FAN Pengfei, GAO Bowen, ZHOU Xu, LI Zhuo
    Computer Engineering and Applications    2021, 57 (10): 1-9.   DOI: 10.3778/j.issn.1002-8331.2101-0402
    Abstract (1180)      PDF(pc) (711KB) (992)

    Precisely predicting trends in network traffic helps operators estimate network usage accurately and allocate and use network resources correctly and efficiently, so as to meet growing and diverse user needs. Following the progress of deep learning algorithms in network traffic prediction, this paper first describes the evaluation indicators for network traffic prediction and the currently available public network traffic data sets. It then analyzes four deep learning methods commonly used in network traffic prediction: deep belief networks, convolutional neural networks, recurrent neural networks, and long short-term memory networks, with a focus on the integrated neural network models used in recent years for different problems. The characteristics and application scenarios of each model are summarized. Finally, the future development of network traffic forecasting is discussed.

    Survey of Interpretability Research on Deep Learning Models
    ZENG Chunyan, YAN Kang, WANG Zhifeng, YU Yan, JI Chunmei
    Computer Engineering and Applications    2021, 57 (8): 1-9.   DOI: 10.3778/j.issn.1002-8331.2012-0357
    Abstract (519)      PDF(pc) (677KB) (844)

    Being data-driven, deep learning technology has made great achievements in natural language processing, image processing, and speech recognition. However, because deep learning models have deep networks, many parameters, and high complexity, the decisions they make and their intermediate processes are difficult for humans to understand. Exploring the interpretability of deep learning has therefore become a new topic in the field of artificial intelligence. This review takes the interpretability of deep learning models as its research object and summarizes progress in the area. First, the main interpretability methods are summarized and analyzed from four aspects: self-explanatory models, model-specific explanation, model-agnostic explanation, and causal interpretability. The review then enumerates applications of interpretability-related technologies, and finally discusses the existing problems of current interpretability research in order to promote the further development of the deep learning interpretability research framework.

    Research Progress of Multi-label Text Classification
    HAO Chao, QIU Hangping, SUN Yi, ZHANG Chaoran
    Computer Engineering and Applications    2021, 57 (10): 48-56.   DOI: 10.3778/j.issn.1002-8331.2101-0096
    Abstract (1002)      PDF(pc) (906KB) (826)

    As a basic task in natural language processing, text classification has been studied since the 1950s. Single-label text classification algorithms have now matured, but there is still considerable room for improvement in multi-label text classification. First, the basic concepts and basic processes of multi-label text classification are introduced, including data set acquisition, text preprocessing, model training, and prediction results. Second, the methods of multi-label text classification are introduced. These methods fall mainly into two categories: traditional machine learning methods and methods based on deep learning. Traditional machine learning methods mainly include problem transformation methods and algorithm adaptation methods. Methods based on deep learning use various neural network models to handle multi-label text classification problems; according to model structure, they are divided into methods based on the CNN structure, the RNN structure, and the Transformer structure. The data sets commonly used in multi-label text classification are summarized. Finally, future development trends are summarized and analyzed.

    Overview of Research on 3D Human Pose Estimation
    WANG Faming, LI Jianwei, CHEN Sixi
    Computer Engineering and Applications    2021, 57 (10): 26-38.   DOI: 10.3778/j.issn.1002-8331.2102-0039
    Abstract (516)      PDF(pc) (1035KB) (798)

    3D human pose estimation is essentially a classification and regression problem that estimates the 3D human pose from images. Traditional methods and deep learning methods are the two mainstream research approaches in this field. Moving from traditional methods to deep learning methods, this paper systematically introduces the 3D human pose estimation methods of recent years. Traditional methods obtain the elements of the human pose through generative and discriminative approaches in order to complete 3D human pose estimation. Methods based on deep learning mainly regress the human pose information from image features by constructing a neural network, and can be roughly divided into three categories: direct regression methods, methods based on 2D information, and hybrid methods. Finally, the paper summarizes current research difficulties and challenges and discusses research trends.

    Review of Attention Mechanism in Convolutional Neural Networks
    ZHANG Chenjia, ZHU Lei, YU Lu
    Computer Engineering and Applications    2021, 57 (20): 64-72.   DOI: 10.3778/j.issn.1002-8331.2105-0135
    Abstract (1081)      PDF(pc) (973KB) (703)

    Attention mechanisms are widely used in deep learning tasks because of their excellent results and plug-and-play convenience. Focusing on convolutional neural networks, this paper introduces the mainstream methods in the development of attention mechanisms for convolutional networks, extracts and summarizes their core ideas and implementation processes, implements each attention mechanism, and carries out comparative experiments and result analysis on measured data from emitter equipment of the same type. Based on the main ideas and experimental results, the research status and future development directions of attention mechanisms in convolutional networks are summarized.

    Review of Text Sentiment Analysis Methods
    WANG Ting, YANG Wenzhong
    Computer Engineering and Applications    2021, 57 (12): 11-24.   DOI: 10.3778/j.issn.1002-8331.2101-0022
    Abstract (585)      PDF(pc) (906KB) (693)

    Text sentiment analysis is an important branch of natural language processing that is widely used in public opinion analysis and content recommendation, and it has been a hot topic in recent years. According to the methods used, it is divided into sentiment analysis based on sentiment dictionaries, sentiment analysis based on traditional machine learning, and sentiment analysis based on deep learning. By comparing these three methods, the paper analyzes existing research results, summarizes the advantages and disadvantages of the different methods, introduces the related data sets, evaluation indexes, and application scenarios, and briefly summarizes the subtasks of sentiment analysis. Future research trends and application fields of sentiment analysis are identified, providing help and guidance for researchers in related areas.

    Review of Application of Transfer Learning in Medical Image Field
    GAO Shuang, XU Qiaozhi
    Computer Engineering and Applications    2021, 57 (24): 39-50.   DOI: 10.3778/j.issn.1002-8331.2107-0300
    Abstract (525)      PDF(pc) (896KB) (691)

    Deep learning technology has developed rapidly and achieved significant results in the field of medical image processing. However, because medical image samples are few and difficult to annotate, the performance of deep learning still falls far short of expectations. In recent years, using transfer learning to alleviate the shortage of medical image samples and improve the performance of deep learning in the medical image field has become a research hot spot. This paper first introduces the basic concepts, types, common strategies, and models of transfer learning methods, then surveys and summarizes representative related research in the medical image field according to the types of transfer learning methods, and finally summarizes the field and looks ahead to its future development.

    Mask Detection Algorithm Based on Improved YOLO Lightweight Network
    WANG Bing, LE Hongxia, LI Wenjing, ZHANG Menghan
    Computer Engineering and Applications    2021, 57 (8): 62-69.   DOI: 10.3778/j.issn.1002-8331.2009-0356
    Abstract (408)      PDF(pc) (1108KB) (617)

    To address insufficient feature extraction and low feature utilization in mask-wearing detection with current YOLO lightweight networks, a lightweight network algorithm based on an improved YOLOv4-tiny is proposed. It adds a Max Module structure to capture more of the target's main features and improve detection accuracy. A bottom-up multi-scale fusion is proposed that combines low-level information to enrich the feature levels of the network and improve feature utilization. CIoU is used as the bounding box regression loss function to speed up model convergence. Compared with the original algorithm, mAP increases by 4.9 percentage points on the public PASCAL VOC data set and by 3.3 percentage points on the mask-wearing detection task, and the detection rate reaches 74 frames/s and 64 frames/s, respectively, which meets the accuracy and real-time requirements of mask-wearing detection.
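
    The CIoU loss named in this abstract can be illustrated with a short sketch. It follows the commonly cited CIoU definition (IoU, a normalized center-distance term, and an aspect-ratio term) and is not taken from the paper itself; the function name `ciou_loss`, the corner-format box layout `(x1, y1, x2, y2)`, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative only: box format and eps are assumptions, not the paper's code.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection over union
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (
        math.atan((tx2 - tx1) / (ty2 - ty1 + eps))
        - math.atan((px2 - px1) / (py2 - py1 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

    Unlike a plain IoU loss, the center-distance term still gives a useful signal when the boxes do not overlap, which is one commonly given reason for faster convergence.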

    Survey of Multimodal Data Fusion
    REN Zeyu, WANG Zhenchao, KE Zunwang, LI Zhe, Wushour·Silamu
    Computer Engineering and Applications    2021, 57 (18): 49-64.   DOI: 10.3778/j.issn.1002-8331.2104-0237
    Abstract (297)      PDF(pc) (1214KB) (605)

    With the rapid development of information technology, information exists in various forms and comes from various sources. Each form or source of information can be referred to as a modality, and data composed of two or more modalities is called multimodal data. Multimodal data fusion is responsible for effectively integrating the information of multiple modalities and drawing on the advantages of each. Natural phenomena have very rich characteristics, and it is difficult for a single modality to provide complete information about a given phenomenon. Fusion must maintain the diversity and completeness of the modal information, maximize the advantages of each modality, and reduce the information loss caused by the fusion process, so how to integrate the information of each modality has become a new challenge in many fields. This paper briefly describes common multimodal fusion methods and fusion architectures, summarizes three common fusion models, and briefly analyzes the advantages and disadvantages of the collaboration, joint, and codec architectures, as well as specific fusion methods such as multi-core learning and image models. In terms of applications, it analyzes and summarizes multimodal video clip retrieval, content summarization from comprehensive multimodal information, multimodal sentiment analysis, and multimodal man-machine dialogue systems. The paper also identifies the current problems of multimodal fusion and future research directions.

    Overview of Deep Learning Speech Synthesis Technology
    ZHANG Xiaofeng, XIE Jun, LUO Jianxin, YANG Tao
    Computer Engineering and Applications    2021, 57 (9): 50-59.   DOI: 10.3778/j.issn.1002-8331.2101-0044
    Abstract (338)      PDF(pc) (879KB) (593)

    Speech synthesis technology plays an important role in human-machine interaction, and the development of deep learning has driven its rapid progress. Speech synthesis based on deep learning surpasses traditional speech synthesis in both quality and speed. This paper reviews deep learning based vocoders and acoustic models for speech synthesis and discusses the working principles, advantages, and disadvantages of each. It then summarizes speech synthesis systems, systematically reviews the classic systems based on deep learning, and finally looks ahead to the future of deep learning based speech synthesis.

    Overview on Reinforcement Learning of Multi-agent Game
    WANG Jun, CAO Lei, CHEN Xiliang, LAI Jun, ZHANG Legui
    Computer Engineering and Applications    2021, 57 (21): 1-13.   DOI: 10.3778/j.issn.1002-8331.2104-0432
    Abstract (463)      PDF(pc) (779KB) (592)

    The use of deep reinforcement learning to solve single-agent tasks has made breakthrough progress. Owing to the complexity of multi-agent systems, however, common algorithms cannot solve the main difficulties. As the number of agents increases, taking the maximization of a single agent's expected cumulative return as the learning goal often fails to converge, and some special convergence points do not satisfy the rationality of the strategy. For practical problems that have no optimal solution, reinforcement learning algorithms are even less effective. Introducing game theory into reinforcement learning handles the interrelationships among agents well and explains the rationality of the strategy corresponding to the convergence point. More importantly, it allows an equilibrium solution to replace the optimal solution in order to obtain a relatively effective strategy. This article therefore surveys the reinforcement learning algorithms that have emerged in recent years from the perspective of game theory, summarizes the key points and difficulties of current game-theoretic reinforcement learning algorithms, and gives several directions in which breakthroughs may resolve these difficulties.

    Review on Integration Analysis and Application of Multi-omics Data
    ZHONG Yating, LIN Yanmei, CHEN Dingjia, PENG Yuzhong, ZENG Yuanpeng
    Computer Engineering and Applications    2021, 57 (23): 1-17.   DOI: 10.3778/j.issn.1002-8331.2106-0341
    Abstract (275)      PDF(pc) (806KB) (578)

    With the continuous emergence and popularization of new omics sequencing technologies, a large amount of omics data has been produced, which is of great significance for further studying and revealing the mysteries of life. Integrating and analyzing multi-omics data to study life science problems yields richer and more comprehensive information about living systems, and has become a new direction for scientists exploring the mechanisms of life. This paper introduces the research background and significance of multi-omics data integration analysis, summarizes the multi-omics data integration analysis methods of recent years and applied research in related fields, and finally discusses the current problems and future prospects of these methods.

    YOLOv5 Helmet Wear Detection Method with Introduction of Attention Mechanism
    WANG Lingmin, DUAN Jun, XIN Liwei
    Computer Engineering and Applications    2022, 58 (9): 303-312.   DOI: 10.3778/j.issn.1002-8331.2112-0242
    Abstract (905)      PDF(pc) (1381KB) (577)
    In high-risk industries such as steel manufacturing, coal mining, and construction, wearing a helmet during work is one of the effective ways to avoid injuries. Current helmet-wearing detection models suffer from false detections and missed detections for small and dense targets in complex environments, so an improved YOLOv5 target detection method is proposed for helmet-wearing detection. A coordinate attention mechanism is added to the backbone network of YOLOv5; it embeds location information into channel attention so that the network can attend to a larger area. The original feature pyramid module in the feature fusion module is replaced with a weighted bi-directional feature pyramid network (BiFPN) structure to achieve efficient bi-directional cross-scale connections and weighted feature fusion. Experimental results on a self-built helmet data set show that the improved YOLOv5 model achieves an average accuracy of 95.9%, which is 5.1 percentage points higher than the original YOLOv5 model, and meets the requirements for small and dense target detection in complex environments.
    Research Progress of Transformer Based on Computer Vision
    LIU Wenting, LU Xinming
    Computer Engineering and Applications    2022, 58 (6): 1-16.   DOI: 10.3778/j.issn.1002-8331.2106-0442
    Abstract (708)      PDF(pc) (1089KB) (574)
    Transformer is a deep neural network that is based on the self-attention mechanism and processes data in parallel. In recent years, Transformer-based models have emerged as an important area of research for computer vision tasks. To address the current lack of domestic review articles on Transformer, this paper covers its application in computer vision. It reviews the basic principles of the Transformer model, focuses on its application to seven visual tasks such as image classification, object detection, and segmentation, and analyzes the Transformer-based models that have achieved significant results. Finally, the paper summarizes the challenges and future development trends of the Transformer model in computer vision.
    Review of Typical Object Detection Algorithms for Deep Learning
    XU Degang, WANG Lu, LI Fan
    Computer Engineering and Applications    2021, 57 (8): 10-25.   DOI: 10.3778/j.issn.1002-8331.2012-0449
    Abstract (652)      PDF(pc) (736KB) (568)

    Object detection is an important research direction in computer vision. Its purpose is to accurately identify the category and location of a specific target object in a given image. In recent years, the feature learning and transfer learning capabilities of deep convolutional neural networks have brought significant progress in feature extraction, image representation, classification, and recognition for object detection algorithms. This paper introduces the research progress of object detection algorithms based on deep learning, the characteristics of common data sets, and the key parameters of performance evaluation, and compares and analyzes the network structures and implementations of two-stage, single-stage, and other improved algorithms. It then describes the application of these algorithms to the detection of human faces, salient targets, pedestrians, remote sensing images, medical images, and grain insects. Finally, future research directions are analyzed in light of current problems and challenges.

    Research Progress of Medical Image Registration Technology Based on Deep Learning
    GUO Yanfen, CUI Zhe, YANG Zhipeng, PENG Jing, HU Jinrong
    Computer Engineering and Applications    2021, 57 (15): 1-8.   DOI: 10.3778/j.issn.1002-8331.2101-0281
    Abstract (593)      PDF(pc) (681KB) (561)

    Medical image registration technology is widely applicable to lesion detection, clinical diagnosis, surgical planning, and efficacy evaluation. This paper systematically summarizes registration algorithms based on deep learning and analyzes the advantages and limitations of the various methods, from deep iteration, full supervision, and weak supervision to unsupervised learning. In general, unsupervised learning has become the mainstream direction of medical image registration research because it does not rely on gold standards and uses an end-to-end network to save time. Compared with other methods, unsupervised learning can also achieve higher accuracy in less time. However, medical image registration methods based on unsupervised learning still face research difficulties and challenges in terms of interpretability, cross-modal diversity, and repeatable scalability, which point to research directions for achieving more accurate medical image registration in the future.

    Progress on Deep Reinforcement Learning in Path Planning
    ZHANG Rongxia, WU Changxu, SUN Tongchao, ZHAO Zengshun
    Computer Engineering and Applications    2021, 57 (19): 44-56.   DOI: 10.3778/j.issn.1002-8331.2104-0369
    Abstract (1188)      PDF(pc) (1134KB) (557)

    The purpose of path planning is to allow a robot to avoid obstacles and quickly plan the shortest path as it moves. After analyzing the advantages and disadvantages of path planning algorithms based on reinforcement learning, the paper introduces a typical deep reinforcement learning algorithm, the Deep Q-learning Network (DQN), which can perform excellent path planning in complex dynamic environments. First, the basic principles and limitations of the DQN algorithm are analyzed in depth, and the advantages and disadvantages of various DQN variants are compared from four aspects: the training algorithm, the neural network structure, the learning mechanism, and the AC (Actor-Critic) framework. The paper then sets out the current challenges and open problems in path planning based on deep reinforcement learning. Finally, future development directions are proposed, which can serve as a reference for the development of intelligent path planning and autonomous driving.

    Review of Neural Style Transfer Models
    TANG Renwei, LIU Qihe, TAN Hao
    Computer Engineering and Applications    2021, 57 (19): 32-43.   DOI: 10.3778/j.issn.1002-8331.2105-0296
    Abstract (733)      PDF(pc) (1078KB) (525)

    The Neural Style Transfer (NST) technique is used to simulate different art styles in images and videos and is a popular topic in computer vision. This paper aims to provide a comprehensive overview of current progress in NST. First, the paper reviews the Non-Photorealistic Rendering (NPR) technique and traditional texture transfer. Then, it categorizes the current major NST methods and gives a detailed description of these methods along with their subsequent improvements. After that, it discusses various applications of NST and presents several evaluation methods that compare different style transfer models both qualitatively and quantitatively. Finally, it summarizes the existing problems and provides some future research directions for NST.

    Robot Dynamic Path Planning Based on Improved A* and DWA Algorithm
    LIU Jianjuan, XUE Liqi, ZHANG Huijuan, LIU Zhongpu
    Computer Engineering and Applications    2021, 57 (15): 73-81.   DOI: 10.3778/j.issn.1002-8331.2103-0525
    Abstract (357)      PDF(pc) (1452KB) (514)

    The traditional A* algorithm is one of the commonly used algorithms for global path planning of mobile robots, but it has low search efficiency, produces paths with many turning points, and cannot achieve dynamic path planning when faced with random dynamic obstacles in a complex environment. To solve these problems, an improved A* algorithm and the DWA algorithm are integrated on the basis of global optimization. The obstacle information in the environment is quantified, and the weight of the heuristic function of the A* algorithm is adjusted according to this information to improve the efficiency and flexibility of the algorithm. Based on the Floyd algorithm, a path node optimization algorithm is designed that deletes redundant nodes, reduces turning points, and improves path smoothness. The dynamic window evaluation function of the DWA algorithm is designed on the basis of the global optimum and is used to distinguish known obstacles from unknown dynamic and static obstacles, and the key points of the path planned by the improved A* algorithm are extracted as the temporary target points of the DWA algorithm. In this way, the improved A* algorithm and the DWA algorithm are fused on the basis of the global optimum. Experimental results show that, in a complex environment, the fusion algorithm not only ensures globally optimal path planning but also effectively avoids dynamic and static obstacles, realizing dynamic path planning in the complex environment.
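
    The idea of scaling the A* heuristic by quantified obstacle information can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the abstract does not give the weighting formula, so `heuristic_weight` (one plus the fraction of free cells) is a hypothetical mapping, and the 4-connected grid, unit step cost, and Manhattan heuristic are likewise assumptions.

```python
import heapq


def heuristic_weight(grid):
    """Hypothetical mapping: fewer obstacles -> larger heuristic weight."""
    cells = sum(len(row) for row in grid)
    obstacles = sum(sum(row) for row in grid)
    return 1.0 + (1.0 - obstacles / cells)


def weighted_astar(grid, start, goal, weight=1.0):
    """Search a 4-connected grid where grid[r][c] == 1 marks an obstacle.

    Cells are ranked by f = g + weight * h, with h the Manhattan distance.
    Returns the path as a list of (row, col) cells, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(weight * h(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > g_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_g + weight * h(nxt), new_g, nxt))
    return None
```

    With `weight=1.0` this is ordinary A* and returns a shortest path on the grid; a larger weight, such as the value from `heuristic_weight(grid)`, typically expands fewer cells at the cost of that guarantee.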

    Review of Deep Learning Based Physiological Abnormality Detection Research
    MA Chenbin, ZHANG Zhengbo, WANG Jing
    Computer Engineering and Applications    2021, 57 (10): 10-25.   DOI: 10.3778/j.issn.1002-8331.2101-0514
    Abstract (306)      PDF(pc) (1372KB) (506)

    Physiological signals usually carry useful information such as the bioelectrical activity, temperature, and pressure of the body, and monitoring their numerical fluctuations can help to detect, or warn of, the risk of clinical events in advance. Deep models are hierarchical machine learning models containing multi-level nonlinear transformations; they have significant advantages in feature extraction and modeling and great application prospects in computer-aided diagnosis. With advances in continuous physiological parameter monitoring, the utility of deep models in detecting abnormalities in physiological electrical signals has gradually increased, and the research focus has expanded to clinical applications. This paper reviews the research progress of deep models in physiological electrical signal abnormality detection. First, the advantages and shortcomings of classical signal abnormality detection methods are analyzed from the perspective of clinical applications, and the current modeling approaches of deep models are described briefly. Then, the modeling principles and latest applications of classical models are summarized from the perspective of discriminative and generative models, and the training architectures and training strategies of deep models are discussed. Finally, the paper summarizes and discusses three aspects, namely abnormality detection in clinical applications, the research progress of deep models, and the availability of physiological data sets, and provides an outlook on future research.

    Overview of Chinese Domain Named Entity Recognition
    JIAO Kainan, LI Xin, ZHU Rongchen
    Computer Engineering and Applications    2021, 57 (16): 1-15.   DOI: 10.3778/j.issn.1002-8331.2103-0127
    Abstract (725)      PDF(pc) (928KB) (505)

    Named Entity Recognition (NER), a classic research topic in natural language processing, is a basic technology for intelligent question answering, knowledge graphs, and other tasks. Domain Named Entity Recognition (DNER) is the domain-specific form of NER. Driven by deep learning technology, Chinese DNER has made breakthroughs. First, this paper summarizes the research framework of Chinese DNER and reviews existing research results from four aspects: the determination of domain data sources, the establishment of domain entity types and specifications, the annotation of domain data sets, and the evaluation metrics of Chinese DNER. Then, it summarizes the common technical frameworks of Chinese DNER, introducing pattern matching methods based on dictionaries and rules, statistical machine learning methods, deep learning methods, and multi-party fusion deep learning methods, with a focus on Chinese DNER methods based on word vector representation and deep learning. Finally, the typical application scenarios of Chinese DNER are discussed, and future development directions are considered.

    Overview of Visual Multi-object Tracking Algorithms with Deep Learning
    ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa
    Computer Engineering and Applications    2021, 57 (13): 55-66.   DOI: 10.3778/j.issn.1002-8331.2102-0260
    Abstract (515)      PDF(pc) (931KB) (484)

    Visual multi-object tracking is a hot topic in computer vision. However, the uncertain number of targets in a scene, mutual occlusion between targets, and the difficulty of discriminating between target features have led to slow progress in real-world applications of visual multi-object tracking. In recent years, with continued in-depth research on visual intelligent processing, a variety of deep learning visual multi-object tracking algorithms have emerged. Based on an analysis of the challenges and difficulties faced by visual multi-object tracking, the algorithms are divided into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), and six sub-categories, and their advantages and disadvantages are studied. The analysis shows that DBT algorithms have a simple structure, but the sub-steps of the algorithm are only weakly correlated. JDT algorithms integrate multi-module joint learning and lead on multiple tracking evaluation indicators. In DBT algorithms the feature extraction module is the key to handling target occlusion, at the expense of speed, while JDT algorithms depend more on the detection module. At present, multi-object tracking is generally moving from DBT-type algorithms toward JDT, achieving a balance between accuracy and speed in stages. Future development directions for multi-object tracking algorithms are proposed in terms of data sets, sub-modules, and specific scenarios.

    Related Articles | Metrics
    Eye Movement and Tracking Data Fusion Algorithm Based on Deep Learning
    ZHAO Yi, GAO Shuping, HE Di
    Computer Engineering and Applications    2021, 57 (10): 211-217.   DOI: 10.3778/j.issn.1002-8331.2002-0191
    Abstract (338)      PDF(pc) (1292KB) (451)

    Traditional data fusion algorithms fuse eye movement and tracking data poorly in multiple scenarios. This paper proposes a new eye movement and tracking data fusion algorithm based on deep learning, namely the Eye-CNN BLSTM algorithm. Firstly, the algorithm adds new artificial features based on the spatial position information of the original eye movement and tracking data. Secondly, a CNN (Convolutional Neural Network) and a BLSTM (Bi-directional Long Short-Term Memory) network are combined to design a new fusion structure. Finally, the experimental results show that the proposed algorithm achieves better fusion performance on the OTB-100 dataset than six classic data fusion algorithms.

    Related Articles | Metrics
    Improved Lightweight Attention Model Based on CBAM
    FU Guodong, HUANG Jin, YANG Tao, ZHENG Siyu
    Computer Engineering and Applications    2021, 57 (20): 150-156.   DOI: 10.3778/j.issn.1002-8331.2101-0369
    Abstract (1222)      PDF(pc) (808KB) (432)

    In recent years, the attention model has been widely used in the field of computer vision. Adding an attention module to a convolutional neural network can significantly improve the performance of the network. However, most existing methods focus on developing more complex attention modules to give the convolutional neural network stronger feature expression capabilities, which inevitably increases the complexity of the model. To achieve a balance between performance and complexity, a lightweight EAM (Efficient Attention Module) model is proposed to optimize the CBAM model. For the channel attention module of CBAM, one-dimensional convolution is introduced to replace the fully connected layer to aggregate the channels. For the spatial attention module of CBAM, the large convolution kernel is replaced with a dilated convolution to enlarge the receptive field and aggregate broader spatial context information. After the model is integrated into YOLOv4 and tested on the VOC2012 data set, mAP increases by 3.48 percentage points. Experimental results show that the attention model introduces only a small number of parameters while greatly improving network performance.
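
    To illustrate why replacing the fully connected layer with a one-dimensional convolution saves parameters, here is a minimal Python sketch. It assumes average-pooled channel descriptors, zero padding, an odd kernel size and a sigmoid gate; the function names are hypothetical, and the abstract does not give the paper's actual layer configuration.

```python
import math

def fc_bottleneck_params(channels, reduction):
    """Weights in a CBAM-style shared MLP (C -> C/r -> C, biases ignored)."""
    hidden = channels // reduction
    return 2 * channels * hidden

def conv1d_channel_attention(channel_means, kernel):
    """Channel weights from a 1D convolution across neighbouring channels.

    channel_means: one globally average-pooled value per channel.
    kernel: 1D convolution weights (odd length assumed for this sketch).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for i in range(len(channel_means)):
        s = sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights

# The 1D convolution needs only len(kernel) weights regardless of channel count.
print(fc_bottleneck_params(256, 16), "FC weights vs", 3, "conv weights")
print(conv1d_channel_attention([0.2, 0.5, 0.1, 0.9], [0.25, 0.5, 0.25]))
```

    Under these assumptions the convolution shares one small kernel across all channels, so its parameter count does not grow with the channel count.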

    Reference | Related Articles | Metrics
    Application of Improved YOLOv3 in Foreign Object Debris Target Detection on Airfield Pavement
    GUO Xiaojing, SUI Haoda
    Computer Engineering and Applications    2021, 57 (8): 249-255.   DOI: 10.3778/j.issn.1002-8331.2007-0173
    Abstract (214)      PDF(pc) (1320KB) (414)

    Aiming at the detection of small Foreign Object Debris (FOD) targets on airfield pavement, a FOD target detection algorithm based on improved YOLOv3 is proposed. Firstly, based on the YOLOv3 network, Darknet-49, which has lower computational complexity, is used as the feature extraction network, and the number of detection scales of YOLOv3 is increased from 3 to 4 to make full use of shallow feature information. Secondly, the K-means++ algorithm based on Markov Chain Monte Carlo (MCMC) sampling is used to cluster the labeled bounding box sizes of FOD, so as to obtain more reasonable anchor box sizes. Finally, the GIoU loss is introduced as the bounding box regression loss function for training on the FOD dataset. The experimental results show that the precision and recall of the improved YOLOv3 target detection algorithm reach 95.3% and 91.1%, respectively. It is faster than Faster R-CNN and more accurate than SSD, and it effectively solves the problems of missed detections and low positioning accuracy in the original YOLOv3.
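
    As an illustration of the GIoU term mentioned above, here is a minimal Python sketch of the standard Generalized IoU definition for two axis-aligned boxes. The (x1, y1, x2, y2) box format and the function names are assumptions made for this example; the abstract does not describe the paper's implementation.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), positive area assumed."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose

def giou_loss(box_a, box_b):
    """Regression loss: 0 for identical boxes, approaching 2 for distant boxes."""
    return 1.0 - giou(box_a, box_b)

print(giou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

    Unlike plain IoU, the enclosing-box penalty keeps the loss informative when the predicted and ground-truth boxes do not overlap at all, which is one reason it is often used for small targets.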

    Related Articles | Metrics
    Research on Key Technologies of UAV Autonomous Inspection System
    WANG Bo, SONG Dan, WANG Hongyu
    Computer Engineering and Applications    2021, 57 (9): 255-263.   DOI: 10.3778/j.issn.1002-8331.2002-0099
    Abstract (290)      PDF(pc) (1251KB) (412)

    Unmanned Aerial Vehicles (UAVs) will become the main tool for future power inspections. At present, their flight paths are mainly designed based on GPS. Low GPS positioning accuracy and weak GPS signals in some areas are the main bottlenecks preventing the widespread application of UAV power inspection. Aiming at the situation awareness of transmission lines in complex environments, a UAV autonomous inspection system based on monocular vision is proposed to realize autonomous navigation without GPS. The system detects and identifies the target based on deep learning and uses the projection relationship of monocular vision to perform three-dimensional positioning of a detected target of known size. Then, a drone flight control program based on the DJI SDK adjusts the drone's flight attitude and navigates autonomously to achieve real-time inspection. The experimental results show that the positioning errors of the system along the X, Y and Z axes of the world coordinate system are 0.31 m, 0.06 m, and 0.24 m, respectively, and that the processing speed is 0.76 frames per second at a distance of 10 meters from the target. The accuracy and feasibility of the system have been verified.
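
    The projection relationship for a target of known size can be sketched with the ideal pinhole camera model. This is an assumption for illustration only: the abstract does not state the camera model, and the function name, parameter names and numbers below are hypothetical. The result is expressed in the camera frame, not the world frame.

```python
def locate_target(pixel_u, pixel_v, pixel_height, real_height_m,
                  focal_px, center_u, center_v):
    """Estimate camera-frame (X, Y, Z) in meters for a target of known height.

    Pinhole model: pixel_height / focal_px == real_height_m / Z.
    focal_px is the focal length in pixels; (center_u, center_v) is the
    principal point of the image.
    """
    z = focal_px * real_height_m / pixel_height    # depth along the optical axis
    x = (pixel_u - center_u) * z / focal_px        # lateral offset
    y = (pixel_v - center_v) * z / focal_px        # vertical offset
    return x, y, z

# Hypothetical numbers: a 0.5 m object that appears 50 px tall.
print(locate_target(700, 360, 50, 0.5, 1000.0, 640, 360))
```

    Because depth is inversely proportional to the measured pixel height, a one-pixel detection error matters more as the target gets farther away and appears smaller.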

    Related Articles | Metrics
    Improved A* Algorithm and Dynamic Window Method for Robot Dynamic Path Planning
    HUAI Chuangfeng, GUO Long, JIA Xueyan, ZHANG Zihao
    Computer Engineering and Applications    2021, 57 (8): 244-248.   DOI: 10.3778/j.issn.1002-8331.2008-0063
    Abstract (217)      PDF(pc) (1504KB) (402)

    In view of the disadvantages of the traditional A* algorithm's node search strategy, such as many path turning points, large turning angles, and feasible paths that are not theoretically optimal, the 3×3 search neighborhood of the traditional A* algorithm is expanded to 7×7, and the redundant sub-nodes lying in the same direction within the extended neighborhood are removed, yielding the 7×7 A* algorithm. This eliminates the 3×3 neighborhood search of the traditional A* algorithm and the restriction that the node moving direction can only be an integer multiple of 0.25π, and it optimizes the search angle. Secondly, for the problem of dynamic path planning of mobile robots in complex environments, the improved 7×7 A* algorithm and the dynamic window algorithm are combined, and a dynamic window evaluation function of the global optimal path is designed, taking into account factors such as moving speed, turning angle, smoothness and security. The fusion of the improved 7×7 A* algorithm and the dynamic window method is compared with a variety of algorithms in simulation. The results show that the fusion algorithm performs better and is efficient and feasible.
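
    One plausible reading of "removing redundant sub-nodes in the same direction" is to keep only the neighbor offsets whose components share no common factor, so that each heading appears once. The Python sketch below builds such a neighborhood; the gcd-based rule and the function name are assumptions for illustration, not the paper's stated procedure.

```python
from math import atan2, gcd

def neighborhood_offsets(radius):
    """Offsets (dx, dy) within a (2*radius+1)-square, one per distinct direction.

    An offset such as (2, 2) points the same way as (1, 1), so it is dropped;
    gcd(|dx|, |dy|) == 1 keeps exactly the nearest cell along each heading.
    """
    offsets = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if (dx, dy) != (0, 0) and gcd(abs(dx), abs(dy)) == 1:
                offsets.append((dx, dy))
    return offsets

classic = neighborhood_offsets(1)    # traditional 3x3 neighborhood
extended = neighborhood_offsets(3)   # 7x7 neighborhood without same-direction duplicates
print(len(classic), len(extended))
print(len({atan2(dy, dx) for dx, dy in extended}), "distinct headings")
```

    Under this reading, the 7×7 neighborhood offers headings between the multiples of 0.25π available in the 3×3 case, which is what allows smaller turning angles. A full planner would also need to check that the cells crossed by a long move such as (3, 1) are free of obstacles.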

    Related Articles | Metrics
    Summary of Dynamic Gesture Recognition Based on Vision
    XIE Yinggang, WANG Quan
    Computer Engineering and Applications    2021, 57 (22): 68-77.   DOI: 10.3778/j.issn.1002-8331.2105-0314
    Abstract (266)      PDF(pc) (598KB) (401)

    Gestures have played a very important role in human communication since ancient times. Vision-based dynamic gesture recognition uses technologies such as computer vision, IoT (Internet of Things) perception and 3D visual sensors to let machines understand human gestures and thus communicate better with humans, which gives it far-reaching research significance for human-machine interaction. The sensor techniques used in dynamic gesture recognition are introduced, and the technical parameters of the related sensors are compared. Following the development of vision-based dynamic gesture recognition at home and abroad, the processing pipeline of dynamic gesture recognition is first stated: gesture detection and segmentation, gesture tracking, and gesture classification. A comparison of the methods involved in each stage shows that deep learning offers strong fault tolerance, robustness, high parallelism and anti-interference, and has achieved much better results than traditional learning algorithms in gesture recognition. Finally, the challenges currently encountered and possible future developments of dynamic gesture recognition are analyzed.

    Reference | Related Articles | Metrics
    Semantic Similarity Calculation Based on Transformer Encoder
    QIAO Weitao, HUANG Haiyan, WANG Shan
    Computer Engineering and Applications    2021, 57 (14): 158-163.   DOI: 10.3778/j.issn.1002-8331.2004-0096
    Abstract (140)      PDF(pc) (1087KB) (398)

    Semantic similarity calculation aims to compute the similarity between texts at the semantic level and is an important task in natural language processing. To address the problem that existing calculation methods cannot fully represent the semantic features of sentences, the TEAM model based on the Transformer encoder is proposed. It extracts the semantic information in sentences by using the contextual semantic encoding ability of the Transformer model. In addition, an interactive attention mechanism is introduced. When two sentences are encoded, the interactive attention mechanism extracts similar features between them, making the model better at capturing important semantic information within each sentence and improving its semantic understanding and generalization capabilities. The experimental results show that the model improves accuracy on English and Chinese semantic similarity tasks and achieves better results than existing methods.

    Related Articles | Metrics
    Research Progress of Object Detection Based on Weakly Supervised Learning
    YANG Hui, QUAN Jichuan, LIANG Xinyu, WANG Zhongwei
    Computer Engineering and Applications    2021, 57 (16): 40-49.   DOI: 10.3778/j.issn.1002-8331.2103-0306
    Abstract (369)      PDF(pc) (633KB) (385)

    With the continuous development of Convolutional Neural Networks (CNN), object detection, one of the most basic technologies in computer vision, has made remarkable progress. Firstly, the current situation is introduced, in which strongly supervised object detection algorithms require precisely labeled datasets. Secondly, object detection algorithms based on weakly supervised learning are studied. The algorithms are classified into four categories according to their feature processing methods, and the advantages and disadvantages of each are analyzed and compared. Thirdly, the detection accuracy of the various weakly supervised object detection algorithms is compared through experiments, and these algorithms are also compared with mainstream strongly supervised object detection algorithms. Finally, future research hotspots for object detection algorithms based on weakly supervised learning are discussed.

    Related Articles | Metrics
    Improved YOLO V2 6D Object Pose Estimation Algorithm
    BAO Zhiqiang, XING Yu, LYU Shaoqing, HUANG Qiongdan
    Computer Engineering and Applications    2021, 57 (9): 148-153.   DOI: 10.3778/j.issn.1002-8331.2001-0367
    Abstract (437)      PDF(pc) (1342KB) (363)

    For 3D pose estimation of a target, a 6D target pose estimation algorithm based on improved YOLO V2 is proposed, building on a deep learning target detection model. The feature information of an object in an RGB image is extracted by a convolutional neural network. Based on 2D detection, the target position information is mapped to three-dimensional space. The point-to-point mapping relationship is used to match and calculate the target's degrees of freedom in three dimensions, from which the 6D pose of the target is estimated. The algorithm detects a target in an RGB image and predicts its 6D pose at the same time, without additional post-processing. Experimental results show that the proposed algorithm performs better on the LineMod and Occlusion LineMod datasets than other recently proposed CNN-based methods. The proposed algorithm runs at 37 frames per second on a Titan X GPU and can process images in real time.

    Related Articles | Metrics
    Collaborative Filtering Recommendation for Joint Attention and Autoencoder
    ZHENG Cheng, WANG Jian
    Computer Engineering and Applications    2021, 57 (10): 139-145.   DOI: 10.3778/j.issn.1002-8331.2002-0290
    Abstract (155)      PDF(pc) (839KB) (358)

    Given the huge number of users and items, recommendation systems usually face the problem of sparse data. To alleviate this problem, a collaborative filtering model that combines an attention mechanism and an autoencoder is proposed. The model sends the rating information to an autoencoder-based collaborative filtering sub-model that mines the user's overall preferences. At the same time, the rating information is fed into an item-based collaborative filtering sub-model that incorporates the attention mechanism to mine the local dependency information between items. The outputs of the sub-models are fused to fit the final results. The model is experimentally verified on the MovieLens and Pinterest datasets, and the experimental results improve on the benchmark.

    Related Articles | Metrics
    Review of Deep Neural Network-Based Image Caption
    XU Hao, ZHANG Kai, TIAN Yingjie, CHONG Faguang, WANG Zichao
    Computer Engineering and Applications    2021, 57 (9): 9-22.   DOI: 10.3778/j.issn.1002-8331.2012-0539
    Abstract (405)      PDF(pc) (1707KB) (354)

    With the rapid development of deep learning, the quality of image captioning has improved significantly. This paper reviews in detail the image captioning methods based on deep neural networks and their research status. Image captioning algorithms combine knowledge of computer vision and natural language processing to automatically generate natural language descriptions based on the content detected in an image, which is an important part of scene understanding. Generally, the basic architecture of the image captioning task is composed of an encoder and a decoder. Improving the encoder or decoder, or applying Generative Adversarial Networks (GAN), Reinforcement Learning (RL), Unsupervised Learning (UL) and Graph Convolution Neural Networks (GCN), can effectively improve the performance of image captioning algorithms. The effects, advantages and disadvantages of each representative model are then analyzed. Moreover, public datasets are introduced, and comparative experiments are carried out on this basis. Finally, the challenges of image captioning and possible future work are discussed.

    Related Articles | Metrics
    Remote Sensing Military Target Detection Algorithm Based on Lightweight YOLOv3
    QIN Weiwei, SONG Tainian, LIU Jieyu, WANG Hongwei, LIANG Zhuo
    Computer Engineering and Applications    2021, 57 (21): 263-269.   DOI: 10.3778/j.issn.1002-8331.2106-0026
    Abstract (183)      PDF(pc) (14418KB) (353)

    In the process of intelligent missile penetration, detecting enemy anti-missile positions from massive remote sensing image data has great application value. Because of the limited computing power of the missile-borne deployment environment, this paper designs a remote sensing target detection algorithm that balances model size, detection accuracy and detection speed. A typical remote sensing military target data set is produced, and the data set is clustered and analyzed with the K-means algorithm. The MobileNetV2 network replaces the backbone network of the YOLOv3 algorithm to keep the network lightweight and fast. A lightweight and efficient channel coordinated attention module and a target rotation invariance detection module suited to remote sensing target characteristics are proposed and embedded in the detection algorithm to improve detection accuracy while keeping the network lightweight. Experimental results show that the accuracy rate of the algorithm reaches 97.8%, an increase of 6.7 percentage points; the recall rate reaches 95.7%, an increase of 3.9 percentage points; the average detection accuracy reaches 95.2%, an increase of 4.4 percentage points; the detection speed reaches 34.19 images per second; and the network size is only 17.5 MB. The results show that the algorithm can meet the comprehensive requirements of intelligent missile penetration.

    Reference | Related Articles | Metrics
    Application of Deep Reinforcement Learning Algorithm on Intelligent Military Decision System
    KUANG Liqun, LI Siyuan, FENG Li, HAN Xie, XU Qingyu
    Computer Engineering and Applications    2021, 57 (20): 271-278.   DOI: 10.3778/j.issn.1002-8331.2104-0114
    Abstract (472)      PDF(pc) (1223KB) (349)

    Deep reinforcement learning algorithms handle discrete decision-making well, but they are difficult to apply to highly complex and continuous modern battlefield situations, and they converge poorly in multi-agent environments. To solve these problems, an improved Deep Deterministic Policy Gradient (DDPG) algorithm is proposed, which introduces priority-based experience replay and a single training mode to improve the convergence speed of the algorithm. At the same time, an exploration strategy with mixed double noise is designed to realize complex and continuous military decision-making and control behavior. An intelligent military decision simulation platform based on the improved DDPG algorithm is developed with Unity3D. A simulation environment in which Blue Army infantry attack a Red Army military base is built to simulate multi-agent combat training. The experimental results show that the algorithm can drive multiple combat agents to complete tactical maneuvers and achieve tactical behaviors, such as bypassing obstacles to reach a dominant area for shooting. The algorithm converges faster, is more stable, and obtains higher round rewards, thereby improving the efficiency of intelligent military decision-making.
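
    Priority-based experience replay can be sketched as sampling stored transitions with probability proportional to a priority score. The abstract does not say which prioritization scheme the paper uses, so the proportional form below, the alpha exponent and the function names are assumptions for illustration only.

```python
import random

def sampling_probabilities(priorities, alpha=0.6):
    """P(i) = p_i**alpha / sum_k p_k**alpha; alpha=0 gives uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(priorities, batch_size, alpha=0.6, rng=random):
    """Draw replay-buffer indices, favouring transitions with high priority."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)

# Hypothetical priorities, e.g. absolute TD errors of four stored transitions.
priorities = [0.1, 2.0, 0.5, 1.0]
print(sampling_probabilities(priorities, alpha=1.0))
print(sample_indices(priorities, batch_size=3))
```

    In this form, transitions the critic predicts poorly are replayed more often, which is the usual motivation for faster convergence; practical implementations typically also correct the resulting sampling bias with importance weights.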

    Reference | Related Articles | Metrics
    Survey on Zero-Shot Learning
    WANG Zeshen, YANG Yun, XIANG Hongxin, LIU Qing
    Computer Engineering and Applications    2021, 57 (19): 1-17.   DOI: 10.3778/j.issn.1002-8331.2106-0133
    Abstract (457)      PDF(pc) (1267KB) (345)

    Although zero-shot learning has developed well since the rise of deep learning, its applications have not yet been organized into a coherent system. This paper gives an overview of the theoretical systems of zero-shot learning, typical models, application systems, present challenges and future research directions. Firstly, it introduces the theoretical systems, covering the definition of zero-shot learning, essential problems, and commonly used data sets. Secondly, some typical models of zero-shot learning are described in chronological order. Thirdly, it presents the application systems of zero-shot learning along three dimensions: words, images and videos. Finally, the paper analyzes the challenges and future research directions in zero-shot learning.

    Reference | Related Articles | Metrics
    Research Progress of Image Style Transfer Based on Deep Learning
    CHEN Huaiyuan, ZHANG Guangchi, CHEN Gao, ZHOU Qingfeng
    Computer Engineering and Applications    2021, 57 (11): 37-45.   DOI: 10.3778/j.issn.1002-8331.2101-0019
    Abstract (238)      PDF(pc) (1275KB) (341)

    Image style transfer is one of the hot research directions in the field of computer vision. With the rise of deep learning, the field of image style transfer has made a breakthrough. To promote the development of image style transfer, the existing research methods of image style transfer based on deep learning are reviewed. Firstly, the image style transfer methods based on deep learning are classified and organized, and the style transfer methods based on convolutional neural networks and generative adversarial networks are compared and analyzed. Then, the improvements and extensions of image style transfer are introduced. Finally, the current challenges and future research directions in the field of image style transfer are discussed.

    Related Articles | Metrics
    Overview of Image Quality Assessment Method Based on Deep Learning
    CAO Yudong, LIU Haiyan, JIA Xu, LI Xiaohui
    Computer Engineering and Applications    2021, 57 (23): 27-36.   DOI: 10.3778/j.issn.1002-8331.2106-0228
    Abstract (261)      PDF(pc) (646KB) (337)

    Image quality evaluation is a measurement of the visual quality of an image or video. Research on image quality evaluation algorithms over the past 10 years is reviewed. First, the measurement indicators of image quality evaluation algorithms and the image quality evaluation datasets are introduced. Then, the different classifications of image quality evaluation methods are analyzed, with a focus on image quality evaluation algorithms that use deep learning, whose basic models are deep convolutional networks, deep generative adversarial networks and transformers. The performance of deep learning algorithms is often higher than that of traditional image quality assessment algorithms. Subsequently, the principle of image quality assessment with deep learning is described in detail. A specific no-reference image quality evaluation algorithm based on a deep generative adversarial network is introduced, which improves the reliability of simulated reference images through enhanced adversarial learning. Deep learning requires massive amounts of data, so data augmentation methods that improve model performance are elaborated. Finally, the future research trends of digital image quality evaluation are summarized.

    Reference | Related Articles | Metrics