Most Read articles

    KANG Mengxuan, SONG Junping, FAN Pengfei, GAO Bowen, ZHOU Xu, LI Zhuo
    2021, 57 (10): 1-9.   DOI: 10.3778/j.issn.1002-8331.2101-0402

    Accurately predicting trends in network traffic helps operators anticipate network usage and allocate and use network resources efficiently to meet growing and diverse user needs. Following the progress of deep learning algorithms in network traffic prediction, this paper first describes the evaluation indicators for network traffic prediction and the currently available public network traffic data sets. Second, it analyzes four deep learning methods commonly used in network traffic prediction: deep belief networks, convolutional neural networks, recurrent neural networks, and long short-term memory networks, and focuses on the integrated neural network models applied to different problems in recent years. The characteristics and application scenarios of each model are summarized. Finally, future developments in network traffic prediction are discussed.


    The rapid development of artificial intelligence technology has driven great social progress. As an important part of artificial intelligence, deep learning has very broad application prospects, and in recent years more and more experts and scholars have begun to study related technologies. Two typical directions are natural language processing and computer vision, and the development of computer vision strongly leads progress in deep learning. This paper introduces the application of convolutional neural networks and a new neural network model in deep learning, the capsule network, together with its dynamic routing algorithm, and compares their advantages and disadvantages. Then, applications of the capsule network are reviewed, and its application fields and advantages are described for image and text tasks. Finally, the paper summarizes and looks ahead to possible directions for improving the capsule network.

    HAO Chao, QIU Hangping, SUN Yi, ZHANG Chaoran
    2021, 57 (10): 48-56.   DOI: 10.3778/j.issn.1002-8331.2101-0096

    As a basic task in natural language processing, text classification has been studied since the 1950s. Single-label text classification algorithms have matured, but there is still considerable room for improvement in multi-label text classification. Firstly, the basic concepts and basic processes of multi-label text classification are introduced, including data set acquisition, text preprocessing, model training and prediction results. Secondly, the methods of multi-label text classification are introduced. These methods fall into two main categories: traditional machine learning methods and methods based on deep learning. Traditional machine learning methods mainly include problem transformation methods and algorithm adaptation methods. Methods based on deep learning use various neural network models to handle multi-label text classification problems. According to the model structure, they are divided into multi-label text classification methods based on CNN, RNN and Transformer structures. The data sets commonly used in multi-label text classification are summarized. Finally, future development trends are summarized and analyzed.

    LIU Di, JIA Jinlu, ZHAO Yuqing, QIAN Yurong
    2021, 57 (7): 1-13.   DOI: 10.3778/j.issn.1002-8331.2011-0341

    Image denoising is a technology that uses the context information of an image sequence to remove noise and restore a clear image. It is an important research topic in the field of computer vision. With the development of machine learning, deep learning has been widely used in image denoising and has become an effective solution. Firstly, deep learning image denoising methods are analyzed. Secondly, the ideas behind these methods are analyzed in detail according to network structure, and their advantages and disadvantages are summarized. Then, through experimental results on DND, PolyU and other data sets, the performance of deep learning based image denoising methods is compared and analyzed. Finally, the key issues of image denoising research are summarized, and future development trends in this field are discussed.

    GUO Yanfen, CUI Zhe, YANG Zhipeng, PENG Jing, HU Jinrong
    2021, 57 (15): 1-8.   DOI: 10.3778/j.issn.1002-8331.2101-0281

    Medical image registration technology has wide application value in lesion detection, clinical diagnosis, surgical planning, and efficacy evaluation. This paper systematically summarizes registration algorithms based on deep learning, and analyzes the advantages and limitations of the various methods, from deep iteration, full supervision and weak supervision to unsupervised learning. In general, unsupervised learning has become the mainstream direction of medical image registration research, because it does not rely on gold standards and uses an end-to-end network to save time. Meanwhile, compared with other methods, unsupervised learning can achieve higher accuracy in less time. However, medical image registration methods based on unsupervised learning also face research difficulties and challenges in terms of interpretability, cross-modal diversity, and repeatable scalability, which point out research directions for achieving more accurate medical image registration in the future.


    Hyperspectral Imagery (HSI) classification is an important task in hyperspectral image processing and application. With the development of deep learning, the Convolutional Neural Network (CNN) has gradually become an effective solution to the HSI classification problem. Firstly, the task of HSI classification is summarized, and the existing problems are analyzed. Secondly, CNN-based classification methods using spectral features, spatial features and spatial-spectral features are systematically reviewed and compared experimentally. Finally, the key issues of HSI classification are summarized and future research directions are discussed.


    Named Entity Recognition (NER), a classic research topic in natural language processing, is a basic technology for intelligent question answering, knowledge graphs and other tasks. Domain Named Entity Recognition (DNER) is the domain-specific form of NER. Driven by deep learning technology, Chinese DNER has made a breakthrough. Firstly, this paper summarizes the research framework of Chinese DNER, and reviews existing research results from four aspects: the determination of domain data sources, the establishment of domain entity types and specifications, the annotation of domain data sets, and the evaluation metrics of Chinese DNER. Then, this paper summarizes the current common technical framework of Chinese DNER, introduces pattern matching methods based on dictionaries and rules, statistical machine learning methods, deep learning methods, and multi-party fusion deep learning methods, and focuses on the analysis of Chinese DNER methods based on word vector representation and deep learning. Finally, typical application scenarios of Chinese DNER are discussed, and future development directions are outlined.


    This paper comprehensively analyzes research on deep learning in natural language processing through a combination of quantitative and qualitative methods. It uses CiteSpace and VOSviewer to draw knowledge graphs of countries, institutions, journal distribution, keyword co-occurrence, co-citation network clustering, and a timeline view of deep learning in natural language processing to clarify the state of the research. By mining important studies in the field, this paper summarizes the research trends, main problems and development bottlenecks, and gives corresponding solutions and ideas. Finally, it offers suggestions on how to track research on deep learning in natural language processing, and provides references for subsequent research and development in the field.


    Text sentiment analysis is an important branch of natural language processing that is widely used in public opinion analysis and content recommendation, and it has been a hot topic in recent years. According to the methods used, it is divided into sentiment analysis based on emotional dictionaries, sentiment analysis based on traditional machine learning, and sentiment analysis based on deep learning. By comparing these three approaches, the paper analyzes the research results, summarizes the advantages and disadvantages of the different methods, introduces the related data sets, evaluation indexes and application scenarios, and briefly summarizes the analysis of sentiment subtasks. Future research trends and application fields of sentiment analysis are identified, providing help and guidance for researchers in related areas.


    Object detection is an important research task in the field of computer vision. It is widely used in robotics, autonomous vehicles, industrial inspection and other fields. On the basis of deep learning theory, the development and research status of object detection algorithms are first systematically summarized, and the characteristics, advantages, disadvantages and real-time performance of the two categories of algorithms are compared. Next, taking three typical kinds of targets in traffic scenes (non-motor vehicles, motor vehicles and pedestrians) as objects, the research status and application of object detection algorithms for detecting and identifying them are discussed and summarized from six aspects: traditional detection methods, object detection algorithms, object detection algorithm optimization, 3D object detection, multimodal object detection and re-identification, with a focus on the advantages, limitations and applicable scenarios of the various methods. Finally, the common object detection and traffic scene data sets and evaluation criteria are summarized, the performance of the two categories of algorithms is compared and analyzed, and the development trend of object detection in traffic scenes is discussed, providing research ideas for intelligent traffic and autonomous vehicles.


    Artificial neural networks are increasingly closely related to other subject areas, and people solve problems in various fields by exploring and improving their layer structure. Based on an analysis of the related literature, this paper summarizes the history of artificial neural networks and presents their relevant principles, including the multilayer perceptron, the back-propagation algorithm, the convolutional neural network and the recurrent neural network. It explains the classic models in the development of the convolutional neural network and the widely used variant structures of the recurrent neural network, reviews the application of each artificial neural network algorithm in related fields, and summarizes possible directions for the development of artificial neural networks.


    Object detection is an important research direction in computer vision. Its purpose is to accurately identify the category and location of a specific target object in a given image. In recent years, the feature learning and transfer learning capabilities of deep convolutional neural networks have brought significant progress in feature extraction, image representation, classification and recognition for target detection algorithms. This paper introduces the research progress of target detection algorithms based on deep learning, the characteristics of common data sets and the key parameters of performance evaluation, and compares and analyzes the network structures and implementations of two-stage, single-stage and other improved target detection algorithms. Finally, the application progress of these algorithms in the detection of human faces, salient targets, pedestrians, remote sensing images, medical images, and grain insects is described. In light of current problems and challenges, future research directions are analyzed.

    ZHANG Xiaoli, ZHANG Kuixing, JIANG Mei, WEI Benzheng, CONG Jinyu
    2021, 57 (6): 1-9.   DOI: 10.3778/j.issn.1002-8331.2009-0046

    Lymphoma is a malignant tumor originating in the lymphoid hematopoietic system. Accurate diagnosis based on medical and pathological images is of great value for its clinical treatment. With the development of machine learning and deep learning technology, the use of artificial intelligence to classify lymphoma images has become a research hotspot in medicine. This paper systematically summarizes and analyzes the research progress of lymphoma imaging and pathological image classification technology, focuses on image classification methods and research based on new technologies such as machine learning, and finally summarizes the related technologies of lymphoma image classification and their prospects.

    ZHU Juntao, YAO Guangle, ZHANG Gexiang, LI Jun, YANG Qiang, WANG Sheng, YE Shaoze
    2021, 57 (7): 22-33.   DOI: 10.3778/j.issn.1002-8331.2012-0200

    With the recent vigorous development of deep learning, Deep Neural Networks (DNN) have made exciting breakthroughs in large-scale image classification and recognition tasks, but they still face huge challenges in solving few-shot learning problems. Few-Shot Learning (FSL) is defined as learning a model that can solve practical problems with a small number of supervised samples, which is of great significance in the field of deep learning. This paper therefore systematically reviews recent work on few-shot learning with DNNs, and divides the solutions into four strategies according to the technology used to solve the few-shot learning problem: data augmentation, metric learning, external memory, and parameter optimization. According to these strategies, it comprehensively reviews the existing few-shot learning methods for DNNs, and summarizes the performance of each strategy on relevant benchmarks. Finally, the limitations of the existing technology are highlighted and its future development directions are discussed to provide a reference for future research.


    The attention mechanism is widely used in deep learning tasks because of its excellent effect and plug-and-play convenience. Focusing on convolutional neural networks, this paper introduces the mainstream methods in the development of attention mechanisms for convolutional networks, extracts and summarizes their core ideas and implementation processes, implements each attention mechanism, and carries out comparative experiments and results analysis on measured data from emitter equipment of the same type. Based on the main ideas and experimental results, the research status and future development directions of attention mechanisms in convolutional networks are summarized.


    Mobile robots with autonomous navigation ability are increasingly widely used in disaster relief, housekeeping and other areas of human life. As one kind of robot vision navigation, the monocular vision navigation algorithm has the advantages of low cost and unlimited distance, but it still suffers from scale uncertainty and initialization problems. Based on the motion characteristics of mobile robots, this review gives a modular analysis of monocular vision navigation technology in terms of obstacle detection, spatial localization and path planning, and traces the iteration and development of the key techniques of monocular vision navigation algorithms. It analyzes the typical algorithms of each module, compares different algorithms in terms of speed, accuracy and robustness, and analyzes their main problems and difficulties. Finally, the future development trend of monocular vision navigation technology for mobile robots is predicted based on human demand for mobile robot capability and the current state of the technology.

    ZENG Chunyan, YAN Kang, WANG Zhifeng, YU Yan, JI Chunmei
    2021, 57 (8): 1-9.   DOI: 10.3778/j.issn.1002-8331.2012-0357

    Thanks to its data-driven learning, deep learning technology has made great achievements in natural language processing, image processing, and speech recognition. However, because deep learning models are characterized by deep networks, many parameters and high complexity, the decisions and intermediate processes of the model are difficult for humans to understand. Therefore, exploring the interpretability of deep learning has become a new topic in the current artificial intelligence field. This review takes the interpretability of deep learning models as its research object and summarizes its progress. Firstly, the main interpretability methods are summarized and analyzed from four aspects: self-explanatory models, model-specific explanation, model-agnostic explanation, and causal interpretability. It then enumerates applications of interpretability-related technologies, and finally discusses the existing problems of current interpretability research to promote the further development of the deep learning interpretability research framework.

    PENG Jing, LUO Haoyu, ZHAO Gansen, LIN Chengchuang, YI Xusheng, CHEN Shaojie
    2021, 57 (3): 44-57.   DOI: 10.3778/j.issn.1002-8331.2010-0335

    Medical image segmentation is an important application area of computer vision in medical image processing. Its goal is to segment the target area from medical images and provide effective help for the subsequent diagnosis and treatment of diseases. Since deep learning technology has made great progress in image processing, medical image segmentation algorithms based on deep learning have gradually become the focus of research in this field. This paper describes the tasks and difficulties of medical image segmentation. Then, it details deep learning-based medical image segmentation algorithms, and classifies and summarizes the current representative methods. Moreover, this paper presents the frequently used algorithm evaluation indicators and datasets in the field of medical image segmentation. The development of medical image segmentation technology is summarized and forecast.

    SUN Jingyang, CHEN Fengdong, HAN Yueyue, WU Yuwen, GAN Yu, LIU Guodong
    2021, 57 (17): 1-9.   DOI: 10.3778/j.issn.1002-8331.2103-0556

    Image super-resolution reconstruction aims to recover high-resolution and clear images from low-resolution images. This article first explains the idea of typical image super-resolution reconstruction methods, and then reviews typical and latest image super-resolution reconstruction algorithms based on deep learning from the dimensions of up-sampling position and up-sampling method, learning strategy, loss function, etc. It analyzes the latest development status, and looks forward to the future development trend.


    Object detection is one of the most basic problems in the field of computer vision and has been widely discussed and studied. In recent years, the development of deep convolutional neural networks has greatly improved detection accuracy, but there are still many challenges in practical applications. Following the current research trends in object detection, recent methods are summarized from four aspects, corresponding to different detection challenges: large variations in object scale, real-time detection, weakly supervised detection, and unbalanced samples. The relationships between different algorithms are analyzed, and the new improved methods, detection processes and implementation results are described. The detection accuracy, advantages, disadvantages and application scenarios of different algorithms are compared in detail. Finally, several possible directions for further development are discussed.


    Theoretical research on and applications of generative adversarial networks have been continuously successful and have become one of the current research hot spots in the field of deep learning. This paper provides a systematic review of the theory of generative adversarial networks and their applications in terms of model types, evaluation criteria and theoretical research progress. It analyzes the strengths and weaknesses of generative models based on explicit density and on implicit density, respectively; summarizes the evaluation criteria of generative adversarial networks and interprets the relationships between the criteria; introduces the research progress of generative adversarial networks at the application level, covering image conversion, image generation, image restoration, video generation, text generation and image super-resolution; and analyzes the theoretical research progress of generative adversarial networks from the perspectives of interpretability, controllability, stability and model evaluation methods. Finally, the paper discusses the challenges of studying generative adversarial networks and looks ahead to possible future directions of development.


    Edge computing, a key technology of the intelligent railway 5G network, sinks data caching, traffic forwarding and application service capabilities to the edge of the network, effectively meeting the low latency, large bandwidth, and massive connection requirements of intelligent railways to support intelligent rail transit applications. However, owing to changes in physical location, business types and other aspects, and to the complex, highly dynamic and low-credibility external environment of the railway scene, the edge nodes of the intelligent railway business face new security challenges. Drawing on the current research status of 5G edge computing security, the security threats faced by railway 5G edge computing are analyzed from four aspects: terminal, edge network, edge node and edge application. On the basis of detailed security requirements and challenges, and of standardization progress, the research methods and evaluation indicators that can be applied to railway MEC security are summarized. In light of the characteristics of railway 5G edge computing, this paper proposes end-to-end railway MEC security service solutions and development directions for future intelligent railway MEC security research.


    Scene text detection plays an important role in machine understanding of scenes. With the development of deep learning in recent years, methods for scene text detection have also changed rapidly and achieved good results. Deep learning based scene text detection methods, classified into regression based, segmentation based, and mixed detection methods, are summarized together with their advantages and drawbacks. The public datasets and performance metrics are introduced. Future trends and some research directions of scene text detection are also discussed.


    Remote sensing image classification technology provides important technical support for the application of domestic remote sensing images in ecological construction, green development, rural revitalization, poverty alleviation and the construction of the Belt and Road, which is of great significance for serving economic and social development, building a beautiful China, and ensuring the safety of people’s livelihood. In recent years, the rapid development of big data and artificial intelligence technology has brought great progress in research on the classification of domestic remote sensing images. This paper briefly analyzes remote sensing image classification technology and the problems in each stage, and summarizes the data of the six main domestic remote sensing satellites. The four classification methods for domestic remote sensing images, based on pixels, mixed pixels, objects and deep learning, are comprehensively analyzed, and their research progress in classification applications is discussed. Through applications in domestic remote sensing image classification, the four methods are further compared and analyzed. Finally, the paper summarizes the existing issues in domestic remote sensing image classification, and predicts future directions for domestic remote sensing image applications.

    WANG Faming, LI Jianwei, CHEN Sixi
    2021, 57 (10): 26-38.   DOI: 10.3778/j.issn.1002-8331.2102-0039

    3D human pose estimation is essentially a classification and regression problem that estimates the 3D human pose from images. Traditional methods and deep learning methods are the mainstream research approaches in this field. This paper systematically introduces recent 3D human pose estimation methods, moving from traditional methods to deep learning methods. Traditional methods obtain the elements of the human pose through generative and discriminative approaches to complete 3D human pose estimation. Methods based on deep learning mainly regress human pose information from image features by constructing a neural network. They can be roughly divided into three categories: methods based on direct regression, methods based on 2D information, and hybrid methods. Finally, the paper summarizes current research difficulties and challenges, and discusses research trends.

    WANG Zeshen, YANG Yun, XIANG Hongxin, LIU Qing
    2021, 57 (19): 1-17.   DOI: 10.3778/j.issn.1002-8331.2106-0133

    Although zero-shot learning has developed well since the rise of deep learning, its applications have not yet been organized into a coherent system. This paper gives an overview of the theoretical systems of zero-shot learning, typical models, application systems, present challenges and future research directions. Firstly, it introduces the theoretical systems in terms of the definition of zero-shot learning, essential problems, and commonly used data sets. Secondly, some typical models of zero-shot learning are described in chronological order. Thirdly, it presents the application systems of zero-shot learning along three dimensions: words, images and videos. Finally, the paper analyzes the challenges and future research directions in zero-shot learning.

    ZHANG Rongxia, WU Changxu, SUN Tongchao, ZHAO Zengshun
    2021, 57 (19): 44-56.   DOI: 10.3778/j.issn.1002-8331.2104-0369

    The purpose of path planning is to allow the robot to avoid obstacles and quickly plan the shortest path during movement. Having analyzed the advantages and disadvantages of reinforcement learning based path planning algorithms, the paper introduces a typical deep reinforcement learning algorithm, the Deep Q-learning Network (DQN), which can perform excellent path planning in a complex dynamic environment. Firstly, the basic principles and limitations of the DQN algorithm are analyzed in depth, and the advantages and disadvantages of various DQN variants are compared from four aspects: the training algorithm, the neural network structure, the learning mechanism and the AC (Actor-Critic) framework. The paper then puts forward the current challenges and problems to be solved in path planning methods based on deep reinforcement learning. Future development directions are proposed, which can provide a reference for the development of intelligent path planning and autonomous driving.

    XU Hao, ZHANG Kai, TIAN Yingjie, CHONG Faguang, WANG Zichao
    2021, 57 (9): 9-22.   DOI: 10.3778/j.issn.1002-8331.2012-0539

    With the rapid development of deep learning, the quality of image captioning has improved significantly. This paper reviews in detail the methods of image captioning based on deep neural networks and their research status. Image captioning algorithms combine knowledge from computer vision and natural language processing to automatically generate natural language descriptions based on the content detected in the image, which is an important part of scene understanding. Generally, the basic architecture of the image captioning task is composed of an encoder and a decoder. Improving the encoder or decoder, or applying Generative Adversarial Networks (GAN), Reinforcement Learning (RL), Unsupervised Learning (UL) and Graph Convolution Neural Networks (GCN), can effectively improve the performance of image captioning algorithms. The effects, advantages and disadvantages of each representative model are then analyzed. Moreover, public datasets are introduced, and comparative experiments are carried out on this basis. Finally, the challenges of image captioning and possible future work are discussed.

    HE Yujie, DU Fang, SHI Yingjie, SONG Lijuan
    2021, 57 (11): 21-36.   DOI: 10.3778/j.issn.1002-8331.2012-0170

    Named entity recognition is an important basic task in information extraction, machine translation, question answering systems and other natural language processing technologies. In recent years, named entity recognition based on deep learning has become a hot topic for researchers. In order to analyze the progress and future development trends of named entity recognition based on deep learning, this paper gives an overview of current methods, including those based on convolutional neural networks, recurrent neural networks, the Transformer model and some other methods, and studies and compares the four kinds of methods in detail. This paper also introduces the application fields of named entity recognition and the data sets and evaluation methods involved. Finally, future research directions are discussed.

    WU Wenjie, SONG Wen’ai, GAO Xuemei, YANG Jijiang, WANG Qing, HUANG Liping, LEI Yi
    2021, 57 (9): 1-8.   DOI: 10.3778/j.issn.1002-8331.2012-0558

    Obstructive Sleep Apnea (OSA) is one of the most common respiratory diseases in adults. It is characterized by frequent upper airway collapse during sleep, which seriously affects people’s sleep quality and health. The diagnosis of obstructive sleep apnea syndrome mainly depends on polysomnography, but this method cannot meet the current large diagnostic demand. With the emergence and development of artificial intelligence, it is assumed that deep learning can effectively assist doctors in diagnosing the syndrome. Starting from the clinical diagnosis of obstructive sleep apnea, this paper introduces the advantages of lateral radiographs of the craniofacial region as a diagnostic data set and the status quo of artificial intelligence in the diagnosis of OSA, puts forward a technical route for artificial intelligence to assist physicians in diagnosing OSA, and analyzes the problems and challenges in the current diagnosis system.


    Physiological signals usually carry useful information such as the bioelectrical activity, temperature and pressure of the body, and monitoring their numerical fluctuations can help to detect or warn of the risk of clinical events in advance. Deep models are hierarchical machine learning models containing multi-level nonlinear transformations, which have significant advantages in feature extraction and modeling, and have great application prospects in the field of computer-aided diagnosis. With the advancement of continuous physiological parameter monitoring technology, the utility of deep models in the detection of physiological electrical signal abnormalities has gradually increased and the research focus has expanded to clinical applications. This paper reviews the research progress of deep models in physiological electrical signal abnormality detection. Firstly, the advantages and shortcomings of classical signal abnormality detection methods are analyzed from the perspective of clinical applications, and the current modeling approaches of deep models are described briefly. Then, the modeling principles and latest applications of classical models are summarized from the perspective of discriminative and generative models, and the training architectures and training strategies of deep models are discussed. Finally, this paper summarizes and discusses three aspects: abnormality detection in clinical applications, the research progress of deep models, and the availability of physiological datasets, and provides an outlook on future research.

    WANG Bing, LE Hongxia, LI Wenjing, ZHANG Menghan
    2021, 57 (8): 62-69.   DOI: 10.3778/j.issn.1002-8331.2009-0356

    To address insufficient feature extraction and low feature utilization in mask wearing detection with current lightweight YOLO networks, a lightweight network algorithm based on an improved YOLOv4-tiny is proposed. It adds a Max Module structure to obtain more of the main features of the target and improve detection accuracy. A bottom-up multi-scale fusion is proposed, which combines low-level information to enrich the feature levels of the network and improve feature utilization. CIoU is used as the bounding box regression loss function to speed up model convergence. Compared with the original algorithm, on the public data set PASCAL VOC and on the mask wearing detection task, mAP is increased by 4.9 and 3.3 percentage points, respectively, and the detection rate reaches 74 frame/s and 64 frame/s, respectively, which meets the accuracy and real-time requirements of mask wearing detection tasks.
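
    The CIoU bounding box regression loss mentioned above can be illustrated with a minimal sketch. It follows the commonly cited CIoU definition (1 - IoU, plus a normalized center-distance term, plus an aspect-ratio consistency term) and is not taken from the paper itself. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; box layout and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of both boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

    Unlike a plain IoU loss, the center-distance term still provides a useful signal when the two boxes do not overlap, which is the usual explanation for faster convergence.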


    While Internet of Things technology brings convenience to daily life, it also brings many security problems. Therefore, a complete and robust system should be established to protect the security of the Internet of Things, so that its objects can communicate safely and effectively. Intrusion detection systems have become a key technology for protecting the security of the Internet of Things. With the continuous development of machine learning and deep learning, researchers have designed a large number of effective intrusion detection systems. This paper reviews these studies. Firstly, the differences between current Internet of Things security and traditional system security are compared. Secondly, intrusion detection systems are classified in detail by detection technology, data source, architecture and working method. Thirdly, starting from the data sets, the current state of machine learning-based intrusion detection systems for the Internet of Things is explained. Finally, the future development direction is discussed.

    ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa
    2021, 57 (13): 55-66.   DOI: 10.3778/j.issn.1002-8331.2102-0260

    Visual multi-object tracking is a hot issue in the field of computer vision. However, the uncertainty of the number of targets in the scene, the mutual occlusion between targets, and the difficulty of discriminating between target features have led to slow progress in the real-world application of visual multi-object tracking. In recent years, with continued in-depth research on visual intelligent processing, a variety of deep learning visual multi-object tracking algorithms have emerged. Based on an analysis of the challenges and difficulties faced by visual multi-object tracking, the algorithms are divided into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), and six sub-categories, and their advantages and disadvantages are studied. The analysis shows that the DBT algorithm has a simple structure, but its sub-steps are only loosely coupled. The JDT algorithm integrates multi-module joint learning and leads on multiple tracking evaluation indicators. In the DBT algorithm, the feature extraction module is the key to handling target occlusion, at the expense of speed, while the JDT algorithm depends more on the detection module. At present, multi-object tracking is generally moving from DBT-type algorithms to JDT, achieving a balance between accuracy and speed in stages. Future development directions of multi-object tracking algorithms in terms of datasets, sub-modules, and specific scenarios are proposed.


    This paper surveys the research and development of machine learning-based Super-Resolution (SR) reconstruction techniques for remote sensing images. These techniques improve the spatial resolution of a remote sensing image by learning the mapping between low-resolution and high-resolution images, thus contributing to the visual analysis of remote sensing images. Firstly, according to the difference in data representation, machine learning-based SR methods for remote sensing images are divided into two categories, i.e., dictionary learning-based methods and deep learning-based methods. Then, the concrete problems addressed by the various methods are briefly described, and their design ideas and principles are analyzed and summarized. Next, the advantages and disadvantages of the various methods and their reconstruction indicators are compared and analyzed. Finally, the problems and difficulties of remote sensing image SR are summarized and its future development trend is discussed.

    TANG Renwei, LIU Qihe, TAN Hao
    2021, 57 (19): 32-43.   DOI: 10.3778/j.issn.1002-8331.2105-0296

    The Neural Style Transfer (NST) technique is used to simulate different art styles in images and videos, and is a popular topic in computer vision. This paper aims to provide a comprehensive overview of current progress in NST. Firstly, the paper reviews the Non-Photorealistic Rendering (NPR) technique and traditional texture transfer. Then, the paper categorizes the current major NST methods and gives a detailed description of these methods along with their subsequent improvements. After that, it discusses various applications of NST and presents several evaluation methods that compare different style transfer models both qualitatively and quantitatively. In the end, it summarizes the existing problems and provides some future research directions for NST.


    In order to better predict stock prices and provide reasonable suggestions for stockholders, a hybrid stock prediction model (LSTM-CNN-CBAM) is proposed that incorporates an attention mechanism into a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN). The model uses an end-to-end network structure. LSTM is used to extract the time-series features in the data, and then CNN is used to mine the deep features in the data. An attention mechanism, the Convolutional Block Attention Module (CBAM), is added to the network structure, which effectively improves the feature extraction capability of the network. A comparative experiment is performed on the Shanghai Stock Exchange Index. By comparing the experimental prediction results and evaluation indicators, the effectiveness and feasibility of adding the CBAM module to the network model combining LSTM and CNN are verified.

    RAN Rong, XU Xinghua, QIU Shaohua, CUI Xiaopeng, OUYANG Bin
    2021, 57 (9): 23-35.   DOI: 10.3778/j.issn.1002-8331.2012-0500

    Cracks are one of the most important factors threatening the safety of civil infrastructure, and timely and accurate surface crack detection can effectively prevent possible accidents. Owing to their simple operation, fast detection speed and high accuracy, Deep Convolutional Neural Network (DCNN) based crack detection methods are now widely used in the structural monitoring of bridges, roads, buildings, railways and similar infrastructure. This paper summarizes the existing major crack detection methods and reviews DCNN-based crack detection methods in three groups: classification-based, object detection-based and segmentation-based methods. Their principles, advantages and disadvantages, and practical applications are also analyzed. This paper introduces the commonly used datasets in crack detection, and discusses the potential problems and future development of DCNN-based crack detection methods.

    DU Zhuoqun, HU Xiaoguang, YANG Shixin, LI Xiaoxiao, WANG Ziqiang, CAI Nengbin
    2021, 57 (14): 1-14.   DOI: 10.3778/j.issn.1002-8331.2103-0197

    With the continuous development of computer vision technology, pedestrian re-identification has played a major role in the fields of security, detection and intelligent surveillance, and has become a current research hotspot. Traditional pedestrian re-identification focuses on the visual information of RGB images collected by cameras and has achieved good results under laboratory conditions, but under adverse conditions such as poor lighting, occlusion by objects, and blurred images, the recognition rate of the algorithms drops sharply. Nowadays, visual information is no longer limited to RGB images; infrared images, depth images, and sketch portraits are also introduced to improve the recognition rate. At the same time, the use of text information and spatiotemporal information also improves the performance of pedestrian re-identification algorithms. However, because of the inherent differences between the various modalities, how to connect multiple kinds of information has become the main problem in multi-source information pedestrian re-identification research. This article surveys the research papers on pedestrian re-identification with multiple sources of information published in recent years, and expounds the current situation, technical difficulties and future development trends of pedestrian re-identification.


    The sea-surface environment is obscured by meteorological factors such as fog, and the contrast of the collected sea-surface images is reduced and disturbed by more noise, which makes it difficult to obtain a complete and accurate salient region when extracting target saliency. To solve these problems, an improved algorithm based on the DeepLabv3 network is proposed for detecting salient objects on the sea surface. More feature information is extracted by using atrous (dilated) convolution and introducing a global attention module. Context information at different dilation rates is connected by fusing the feature matrices. Then, a constraint term is added to the binary cross entropy loss function to constrain the saliency of cloud occlusion. The model is saved after training on a large data set and then on the sea-surface cloud occlusion data set. Experimental results show that the salient region obtained by the proposed method describes the target region completely and remains largely stable under interference. The average F-measure of the proposed method is 22.12%, 15.83% and 13.30% higher than that of the comparison algorithms when the occlusion degree is 30, 50 and 70, respectively.
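
    For readers unfamiliar with the F-measure reported above, the following is a minimal sketch of how it can be computed for one binary saliency mask against a ground-truth mask. The abstract does not state the weighting used, so the default `beta2=0.3` (a value often used in saliency detection work) is an assumption, as are the function name `f_measure` and the nested-list mask format.

```python
def f_measure(pred_mask, true_mask, beta2=0.3):
    """Weighted F-measure of a binary prediction mask against ground truth.

    Masks are nested lists of 0/1 values with the same shape.
    beta2 is the squared beta weight; 0.3 is an assumed default.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # salient pixel correctly detected
            elif p and not t:
                fp += 1          # background marked as salient
            elif t and not p:
                fn += 1          # salient pixel missed

    if tp == 0:
        return 0.0               # no correct detections: precision and recall are 0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)
```

    An average F-measure such as the one quoted in the abstract would then be the mean of this value over all test images at a given occlusion degree.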