Most Downloaded Articles


    Published in last 1 year
    Review of Visual Odometry Methods Based on Deep Learning
    ZHI Henghui, YIN Chenyang, LI Huibin
    Computer Engineering and Applications    2022, 58 (20): 1-15.   DOI: 10.3778/j.issn.1002-8331.2203-0480
    Abstract (653)      PDF(pc) (904KB) (370)
    Visual odometry (VO) is a common method for localizing mobile devices equipped with vision sensors, and has been widely used in autonomous driving, mobile robots, AR/VR and other fields. Compared with traditional model-based methods, deep learning-based methods can learn efficient and robust feature representations from data without explicit computation, which improves their ability to handle challenging scenes such as illumination changes and low texture. This paper first briefly reviews model-based visual odometry methods, and then focuses on six aspects of deep learning-based visual odometry: supervised learning methods, unsupervised learning methods, model-learning fusion methods, common datasets, evaluation metrics, and a comparison of model-based and deep learning methods. Finally, existing problems and future development trends of deep learning-based visual odometry are discussed.
    Survey on Image Semantic Segmentation in Dilemma of Few-Shot
    WEI Ting, LI Xinlei, LIU Hui
    Computer Engineering and Applications    2023, 59 (2): 1-11.   DOI: 10.3778/j.issn.1002-8331.2205-0496
    Abstract (461)      PDF(pc) (4301KB) (336)
    In recent years, image semantic segmentation has developed rapidly thanks to the emergence of large-scale datasets. In practical applications, however, large-scale, high-quality images are not easy to obtain, and image annotation consumes considerable manpower and time. To reduce the dependence on the number of samples, few-shot semantic segmentation has gradually become a research hotspot. Current few-shot semantic segmentation methods mainly use the idea of meta-learning and, according to their model structures, can be divided into three categories: methods based on the siamese neural network, methods based on the prototype network, and methods based on the attention mechanism. Based on current research, this paper introduces the development, advantages and disadvantages of the various few-shot semantic segmentation methods, as well as common datasets and experimental designs. On this basis, application scenarios and future development directions are summarized.
    Research Progress in Application of Graph Anomaly Detection in Financial Anti-Fraud
    LIU Hualing, LIU Yaxin, XU Junyi, CHEN Shanghui, QIAO Liang
    Computer Engineering and Applications    2022, 58 (22): 41-53.   DOI: 10.3778/j.issn.1002-8331.2203-0233
    Abstract (456)      PDF(pc) (1848KB) (332)
    With the rapid development of digital finance, fraud presents new characteristics such as increasing intelligence, industrialization and strong concealment, and the limitations of traditional expert rules and machine learning methods are increasingly apparent. Graph anomaly detection technology has a strong ability to handle associated information, which provides a new approach to financial anti-fraud. Firstly, the development and advantages of graph anomaly detection are briefly introduced. Secondly, from the perspectives of individual anti-fraud and group anti-fraud, graph anomaly detection technology is divided into individual fraud detection based on features, proximity, graph representation learning or community division, and gang fraud detection based on dense subgraphs, dense subtensors or deep network structures. The basic idea, advantages, disadvantages, research progress and typical applications of each anomaly detection technology are analyzed and compared. Finally, common test datasets and evaluation criteria are summarized, and the development prospects and research directions of graph anomaly detection technology in financial anti-fraud are given.
    Computer Engineering and Applications    2022, 58 (22): 0-0.  
    Abstract (177)      PDF(pc) (176505KB) (331)
    Overview of Multi-Agent Path Finding
    LIU Zhifei, CAO Lei, LAI Jun, CHEN Xiliang, CHEN Ying
    Computer Engineering and Applications    2022, 58 (20): 43-64.   DOI: 10.3778/j.issn.1002-8331.2203-0467
    Abstract (915)      PDF(pc) (1013KB) (317)
    The multi-agent path finding (MAPF) problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents must be able to follow these paths concurrently without colliding with each other. MAPF is widely used in logistics, military, security and other fields. The main research results on MAPF at home and abroad are systematically sorted and classified by planning method into centralized planning algorithms and distributed execution algorithms. Centralized planning algorithms are both the most classical and the most commonly used MAPF algorithms, and are mainly divided into four types, based on A* search, conflict search, cost growth tree and protocol. Distributed execution algorithms are based on reinforcement learning and, according to the improvement technique used, can be divided into three types: expert demonstration, improved communication and task decomposition. Based on this classification, the characteristics and applicability of MAPF algorithms are compared, the advantages and disadvantages of existing algorithms are analyzed, the challenges of existing algorithms are pointed out, and future work is forecast.
    Cross-Social Network User Matching Based on User Check-in
    DAI Jun, MA Qiang
    Computer Engineering and Applications    2023, 59 (2): 76-84.   DOI: 10.3778/j.issn.1002-8331.2203-0581
    Abstract (74)      PDF(pc) (28513KB) (260)
    Cross-social network user matching technology can integrate multi-platform user data to enable more diverse applications. Existing research on check-in-based social network user matching ignores the imbalance of multi-source social network check-in data, which reduces matching accuracy on real datasets. To address this problem, this paper proposes a cross-social network user matching method based on user check-ins. Firstly, the user check-in data is coarsened and filtered with a grid clustering algorithm, and the check-in data with strong potential correlation is selected; then spatiotemporal features are extracted from the check-in data, and the similarity of different attributes is calculated; finally, the user matching score is computed by optimizing the weight distribution over the multi-attribute similarities. Experimental results on multiple datasets demonstrate the effectiveness of the proposed method when check-in data is unbalanced.
    Overview of Smoke and Fire Detection Algorithms Based on Deep Learning
    ZHU Yuhua, SI Yiyi, LI Zhihui
    Computer Engineering and Applications    2022, 58 (23): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0154
    Abstract (326)      PDF(pc) (782KB) (253)
    Among various disasters, fire is one of the most frequent and widespread threats to public safety and social development. With the rapid development of economic construction and the growing size of cities, the number of major fire hazards has increased dramatically. However, the widely used smoke-sensor method of fire detection is susceptible to factors such as distance, resulting in untimely detection. The introduction of video surveillance systems has provided new ideas for solving this problem. Traditional video-based image processing algorithms were proposed earlier, and the recent rapid development of machine vision and image processing technologies has produced a series of methods that use deep learning to automatically detect fires in videos and images, which have very important practical applications in the field of fire safety. To comprehensively analyze the improvements and applications of deep learning methods for fire detection, this paper first briefly introduces the deep learning-based fire detection process, then presents a detailed comparative analysis of deep learning methods for fire detection at three granularities: classification, detection, and segmentation, and elaborates on the improvements each class of algorithms makes to address existing problems. Finally, the problems of fire detection at the present stage are summarized and future research directions are proposed.
    Survey of Transformer Research in Computer Vision
    LI Xiang, ZHANG Tao, ZHANG Zhe, WEI Hongyang, QIAN Yurong
    Computer Engineering and Applications    2023, 59 (1): 1-14.   DOI: 10.3778/j.issn.1002-8331.2204-0207
    Abstract (318)      PDF(pc) (1285KB) (234)
    Transformer is a deep neural network based on the self-attention mechanism. In recent years, Transformer-based models have become a hot research direction in the field of computer vision, and their structures are constantly being improved and expanded, for example with local attention mechanisms and pyramid structures. Improved vision models based on the Transformer structure are reviewed and summarized in terms of performance optimization and structural improvement, respectively. In addition, the advantages and disadvantages of the Transformer and convolutional neural network (CNN) structures are compared and analyzed, and a new hybrid CNN+Transformer structure is introduced. Finally, the development of Transformer in computer vision is summarized and its prospects are discussed.
    Review of Deep Reinforcement Learning Model Research on Vehicle Routing Problems
    YANG Xiaoxiao, KE Lin, CHEN Zhibin
    Computer Engineering and Applications    2023, 59 (5): 1-13.   DOI: 10.3778/j.issn.1002-8331.2210-0153
    Abstract (276)      PDF(pc) (1036KB) (226)
    The vehicle routing problem (VRP) is a classic NP-hard problem that is widely applied in transportation, logistics and other fields. As problem scale and dynamic factors increase, traditional methods for solving the VRP are challenged in computational speed and intelligence. In recent years, with the rapid development of artificial intelligence technology, and in particular the successful application of reinforcement learning in AlphaGo, a new approach to solving routing problems has emerged. In view of this, this paper summarizes the recent literature that uses deep reinforcement learning (DRL) to solve the VRP and its variants. Firstly, it reviews the relevant principles of using DRL to solve the VRP and sorts out the key steps of DRL-based VRP solving. Then it systematically classifies and summarizes four types of solution methods, based on the pointer network, graph neural network, Transformer and hybrid models, and compares and analyzes the performance of current DRL-based models in solving the VRP and its variants. Finally, this paper sums up the challenges of DRL-based VRP solving and future research directions.
    Review of Cross-Domain Object Detection Algorithms Based on Depth Domain Adaptation
    LIU Hualing, PI Changpeng, ZHAO Chenyu, QIAO Liang
    Computer Engineering and Applications    2023, 59 (8): 1-12.   DOI: 10.3778/j.issn.1002-8331.2210-0063
    Abstract (302)      PDF(pc) (583KB) (224)
    In recent years, object detection algorithms based on deep learning have attracted wide attention due to their high detection performance, and have been successfully applied in many fields such as autonomous driving and human-computer interaction. However, traditional deep learning methods assume that the training set (source domain) and the test set (target domain) follow the same distribution. This assumption is often unrealistic, and when it fails the generalization performance of the model is severely reduced. How to align the distributions of the source domain and the target domain so as to improve the generalization of object detection models has become a research hotspot in the past two years. This article reviews cross-domain object detection algorithms. First, it introduces the preliminary knowledge of cross-domain object detection: depth domain adaptation and object detection. Cross-domain object detection is decomposed into these two sub-areas for an overview, in order to understand its development from the underlying logic. Next, this article introduces the latest developments in cross-domain object detection algorithms in five categories, namely discrepancy-based, adversarial, reconstruction-based, hybrid and other methods, and sorts out the research context of each category. Finally, this article summarizes the development trend of cross-domain object detection algorithms and looks ahead to future work.
    Survey of Camera Pose Estimation Methods Based on Deep Learning
    WANG Jing, JIN Yuchu, GUO Ping, HU Shaoyi
    Computer Engineering and Applications    2023, 59 (7): 1-14.   DOI: 10.3778/j.issn.1002-8331.2209-0280
    Abstract (294)      PDF(pc) (702KB) (196)
    Camera pose estimation is a technology for accurately estimating the 6-DOF position and orientation of a camera in the world coordinate system in a known environment. It is a key technology in robotics and autonomous driving. With the rapid development of deep learning, using deep learning to optimize camera pose estimation algorithms has become one of the current research hotspots. To capture the current research status and trends of camera pose estimation algorithms, the mainstream deep learning-based algorithms are summarized. Firstly, traditional camera pose estimation methods based on feature points are briefly introduced. Then, camera pose estimation methods based on deep learning are introduced as the main focus. According to their core algorithms, end-to-end camera pose estimation, scene coordinate regression, retrieval-based camera pose estimation, hierarchical structure, multi-information fusion and cross-scene camera pose estimation are elaborated and analyzed in detail. Finally, this paper summarizes the current research status, points out the challenges in the field of camera pose estimation based on in-depth performance analysis, and discusses the development trend of camera pose estimation.
    Review of Object Detection Algorithm Improvement in Deep Learning
    YANG Feng, DING Zhitong, XING Mengmeng, DING Bo
    Computer Engineering and Applications    2023, 59 (11): 1-15.   DOI: 10.3778/j.issn.1002-8331.2209-0312
    Abstract (237)      PDF(pc) (691KB) (195)
    Object detection is currently a research hotspot in the field of computer vision. With the development of deep learning, object detection algorithms based on deep learning are increasingly applied and their performance keeps improving. This paper reviews the latest research progress of deep learning-based object detection methods by summarizing the common problems encountered in object detection and the corresponding improvement methods. It focuses on two types of deep learning-based object detection algorithms. In addition, the latest improvement ideas for object detection algorithms are summarized from the aspects of attention mechanism, lightweight network and multi-scale detection. Finally, in view of the current problems in the field of object detection, future development trends are discussed and feasible solutions are put forward, in order to provide reference ideas and directions for follow-up research in this field.
    Survey on Deep-Learning-Based Long-Term Object Tracking Algorithms
    LIANG Yitao, HAN Yongbo, LI Lei
    Computer Engineering and Applications    2023, 59 (4): 1-17.   DOI: 10.3778/j.issn.1002-8331.2206-0507
    Abstract (237)      PDF(pc) (918KB) (191)
    In the field of visual target tracking, long-term tracking has received increasing attention from researchers because it involves more realistic and challenging scenarios, such as occlusion, interference from similar objects and target disappearance. However, traditional long-term tracking algorithms are inefficient and cannot meet the tracker performance requirements of applications such as video surveillance and autonomous driving. Recently, a large body of work has rapidly advanced long-term tracking techniques by introducing deep neural networks. To analyze the current situation and future development of deep-learning-based long-term tracking algorithms, firstly, long-term and short-term tracking datasets and their evaluation indicators are compared, the requirements and difficulties of long-term tracking tasks are summarized, and the development of long-term tracking datasets and evaluation systems is introduced. Subsequently, based on the design framework of deep-learning-based long-term tracking algorithms, the design ideas of each component of the framework are described in detail. Then, taking the long-term tracking strategy as the starting point, existing research work is analyzed, and the characteristics, advantages and disadvantages of different models are summarized. Finally, based on the summary of existing research work, the challenges faced in this field are discussed, and future research trends are presented.
    Construction and Application of Discipline Knowledge Graph in Personalized Learning
    ZHAO Yubo, ZHANG Liping, YAN Sheng, HOU Min, GAO Mao
    Computer Engineering and Applications    2023, 59 (10): 1-21.   DOI: 10.3778/j.issn.1002-8331.2209-0345
    Abstract (206)      PDF(pc) (929KB) (187)
    The discipline knowledge graph is an important tool for supporting teaching activities based on big data, artificial intelligence and other technologies. As a semantic network of discipline knowledge, it contributes to the development of personalized learning systems and the promotion of new infrastructure for digital education resources. Firstly, this paper outlines the concept and classification of knowledge graphs. Secondly, it summarizes the concept, characteristics, advantages and connotation of the discipline knowledge graph and its support for personalized learning. Next, it focuses on the construction process of the discipline knowledge graph, namely discipline ontology construction, discipline knowledge extraction, discipline knowledge fusion and discipline knowledge processing, and introduces the applications of the discipline knowledge graph in personalized learning situations and the associated challenges. Finally, this paper discusses future trends of the discipline knowledge graph and personalized learning, providing reference and inspiration for the organization of educational resources and the innovative development of personalized learning.
    Object Detection Algorithms Based on Deep Learning and Transformer
    FU Miaomiao, DENG Miaolei, ZHANG Dexian
    Computer Engineering and Applications    2023, 59 (1): 37-48.   DOI: 10.3778/j.issn.1002-8331.2205-0354
    Abstract (304)      PDF(pc) (947KB) (183)
    Object detection is the basis for advanced vision tasks such as object tracking and instance segmentation, and has important applications in real-world scenarios such as intelligent transportation, defect detection, and intelligent security. Existing high-precision detection algorithms are all implemented with deep learning and rely on anchor box technology. However, the shortcomings of the anchor box itself have a great impact on the performance of the detector, and anchor-free detection has become a new research direction in the field of object detection in recent years. At the same time, the great potential shown by Transformer has opened up a new direction of combining images with Transformer in the field of vision, and Transformer-based object detection has also become a new research hotspot. This paper systematically summarizes the object detection algorithms of the deep learning era, surveys papers on object detection from the past five years, analyzes these algorithms in depth from the perspectives of anchor-free detection and Transformer, and introduces their specific applications in real scenarios and the datasets commonly used in the field of object detection. Finally, based on the current research status, future research directions of object detection are discussed.
    Target Detection Algorithm of Remote Sensing Image Based on Improved YOLOv5
    LI Kunya, OU Ou, LIU Guangbin, YU Zefeng, LI Lin
    Computer Engineering and Applications    2023, 59 (9): 207-214.   DOI: 10.3778/j.issn.1002-8331.2209-0119
    Abstract (233)      PDF(pc) (665KB) (179)
    To address the low target detection accuracy caused by high background complexity, varied target sizes and a large number of small targets in remote sensing images, this paper proposes a remote sensing image target detection algorithm based on improved YOLOv5. The channel-global attention mechanism (CGAM) is introduced into the backbone network to enhance feature extraction for targets at different scales and to suppress interference from redundant information. The dense upsampling convolution (DUC) module is introduced to expand the low-resolution convolutional feature maps, which effectively enhances the fusion of different convolutional feature maps. The improved algorithm is applied to the public remote sensing dataset RSOD, where the average precision (AP) of the improved YOLOv5 algorithm reaches 78.5%, 3.1 percentage points higher than that of the original algorithm. Experimental results show that the improved algorithm can effectively improve the accuracy of remote sensing image target detection.
    Research Progress of YOLO Series Target Detection Algorithms
    WANG Linyi, BAI Jing, LI Wenjing, JIANG Jinzhe
    Computer Engineering and Applications    2023, 59 (14): 15-29.   DOI: 10.3778/j.issn.1002-8331.2301-0081
    Abstract (172)      PDF(pc) (1009KB) (178)
    The YOLO-based algorithm is one of the hot research directions in target detection. In recent years, as YOLO series algorithms and their improved models have been continually proposed, YOLO-based algorithms have achieved excellent results in the field of target detection and have been widely used in practice. This article first introduces typical datasets and evaluation metrics for target detection, and reviews the overall YOLO framework and the development of the target detection algorithms from YOLOv1 to YOLOv7. Then, models and their performance are summarized across eight improvement directions, such as data augmentation, lightweight network construction, and IoU loss optimization, at the three stages of input, feature extraction, and prediction. Afterwards, the application fields of the YOLO algorithm are introduced. Finally, in light of practical problems in target detection, the article summarizes the development of YOLO-based algorithms and discusses future directions.
    Small Object Detection Algorithm Based on Improved YOLOv5 in UAV Image
    XIE Chunhui, WU Jinming, XU Huaiyu
    Computer Engineering and Applications    2023, 59 (9): 198-206.   DOI: 10.3778/j.issn.1002-8331.2212-0336
    Abstract (224)      PDF(pc) (808KB) (176)
    UAV aerial images are characterized by large scale variations and complex backgrounds, which make it difficult for existing detectors to detect small objects in them. To address false detections and missed detections, a small object detection model, Drone-YOLO, is proposed. A new detection branch is added to improve detection capability at multiple scales, and the model contains a novel feature pyramid network with multi-level information aggregation, which fuses cross-layer information. A feature fusion module based on a multi-scale channel attention mechanism is then designed to improve the focus on small objects. The classification task of the prediction head is decoupled from the regression task, and the loss function is optimized using Alpha-IoU to improve detection accuracy. Experimental results on the VisDrone dataset show that Drone-YOLO improves AP50 by 4.91 percentage points compared with YOLOv5, with an inference time of only 16.78 ms. Compared with other mainstream models, it detects small targets better and can effectively complete the task of small target detection in UAV aerial images.
    Review of Explainable Artificial Intelligence
    ZHAO Yanyu, ZHAO Xiaoyong, WANG Lei, WANG Ningning
    Computer Engineering and Applications    2023, 59 (14): 1-14.   DOI: 10.3778/j.issn.1002-8331.2208-0322
    Abstract (162)      PDF(pc) (683KB) (175)
    With the development of machine learning and deep learning, artificial intelligence technology has been gradually applied in various fields. However, one of the biggest drawbacks of adopting AI is its inability to explain the basis for its predictions. The black-box nature of the models makes it impossible for humans to truly trust them in mission-critical application scenarios such as healthcare, finance, and autonomous driving, which limits the practical deployment of AI in these areas. Advancing explainable artificial intelligence (XAI) has therefore become an important issue for deploying AI in mission-critical applications. At present, there is still a lack of research reviews on XAI in related fields at home and abroad, as well as a lack of studies focusing on causal explanation methods and the evaluation of explainable methods. Therefore, starting from the characteristics of explanation methods, this study divides the main explainable methods into three categories by explanation type: model-independent methods, model-dependent methods, and causal explanation methods, and summarizes and analyzes each of them. It then summarizes the evaluation of explanation methods, lists the applications of explainable AI, and finally discusses the current problems of explainability and provides an outlook.
    Overview of Cross-Modal Retrieval Technology
    XU Wenwan, ZHOU Xiaoping, WANG Jia
    Computer Engineering and Applications    2022, 58 (23): 12-23.   DOI: 10.3778/j.issn.1002-8331.2205-0160
    Abstract (464)      PDF(pc) (769KB) (171)
    Cross-modal retrieval can retrieve information in other modalities through one modality, and has become a research hotspot in the era of big data. Researchers use real-valued representation and binary representation to reduce the semantic gap between different modal information and to compare similarity effectively, but problems of low retrieval efficiency or information loss remain. At present, how to further improve retrieval efficiency and information utilization is a key challenge for cross-modal retrieval research. Firstly, the development status of real-valued representation and binary representation in cross-modal retrieval is introduced. Secondly, five cross-modal retrieval methods under the two representation technologies are analyzed and compared in terms of modeling technology and similarity comparison: subspace learning, topic statistical model learning, deep learning, traditional hashing and deep hashing. Then, the latest multi-modal datasets are summarized to provide a valuable reference for relevant researchers and engineers. Finally, the challenges of cross-modal retrieval are analyzed and future research directions in this field are pointed out.
    Survey on Computational Approaches for Drug-Target Interaction Prediction
    ZHANG Ran, WANG Xuezhi, WANG Jiajia, MENG Zhen
    Computer Engineering and Applications    2023, 59 (12): 1-13.   DOI: 10.3778/j.issn.1002-8331.2210-0108
    Abstract (189)      PDF(pc) (675KB) (164)
    Drug-target interaction prediction aims to discover potential drugs acting on specific proteins, and plays an important role in drug repositioning, drug side effect prediction, polypharmacology and drug resistance research. With advances in computer processing and the continuous updating of computing algorithms, computational drug-target interaction prediction has shown the advantages of short time, low cost, high precision and wide coverage, and has received extensive attention and made remarkable progress. To sort out the development history and explore future research directions, the background and significance of drug-target interaction prediction are first introduced briefly. Secondly, the methods are classified into four types: molecular docking-based, drug structure-based, text mining-based and chemogenomic-based methods. A comparative analysis of each method is carried out, and the data requirements and application scenarios of each type of method are described in detail. Finally, the limitations and challenges of existing research are discussed, and future research directions are outlined to provide references for follow-up research.
    Review of Real-Time Semantic Segmentation Algorithms for Deep Learning
    HE Jiafeng, CHEN Hongwei, LUO Dehan
    Computer Engineering and Applications    2023, 59 (8): 13-27.   DOI: 10.3778/j.issn.1002-8331.2210-0144
    Abstract (193)      PDF(pc) (1161KB) (160)
    Semantic segmentation is a technique that segments the different objects in a picture at the pixel level and labels each pixel in the original picture. Application fields such as UAV navigation, remote sensing imagery and medical diagnosis require semantic segmentation to run in real time, so real-time semantic segmentation technology based on deep learning has developed rapidly, and many technologies and models now exist. On the basis of a study of the related literature, real-time semantic segmentation is introduced by way of semantic segmentation, and its advantages are briefly described. Then, the key points and difficulties of real-time semantic segmentation are discussed. In light of these key points and difficulties, the existing related technologies and models are described, and their advantages and disadvantages are summarized. Finally, the challenges faced by real-time semantic segmentation are discussed and the field is summarized, providing some theoretical references for follow-up discussion.
    Reference | Related Articles | Metrics
    Survey of Research on Deep Multimodal Representation Learning
    PAN Mengzhu, LI Qianmu, QIU Tian
    Computer Engineering and Applications    2023, 59 (2): 48-64.   DOI: 10.3778/j.issn.1002-8331.2206-0145
    Abstract191)      PDF(pc) (6521KB)(159)       Save
    Although deep learning has been widely used in many fields because of its powerful nonlinear representation capabilities, the structural and semantic gap between multi-source heterogeneous modal data seriously hinders the application of subsequent deep learning models. Many scholars have proposed a large number of representation learning methods to explore the correlation and complementarity between different modalities, and improve the performance of deep learning prediction and generalization. However, the research on multimodal representation learning is still in its infancy, and there are still many scientific problems to be solved. So far, multimodal representation learning still lacks a unified cognition, and the architecture and evaluation metrics of multimodal representation learning research are not fully clear. According to the feature structure, semantic information and representation ability of different modalities, this paper studies and analyzes the progress of deep multimodal representation learning from the perspectives of representation fusion and representation alignment. And the existing research work is systematically summarized and scientifically classified. At the same time, this paper analyzes the basic structure, application scenarios and key issues of representative frameworks and models, analyzes the theoretical basis and latest development of deep multimodal representation learning, and points out the current challenges and future development of multimodal representation learning research, to further promote the development and application of deep multimodal representation learning.
    Reference | Related Articles | Metrics
    Overview of Table Detection and Structure Recognition
    ZHANG Yutong, LI Qiyuan, LIU Shukan
    Computer Engineering and Applications    2022, 58 (22): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0337
    Abstract192)      PDF(pc) (859KB)(155)       Save
    In view of the current development of table analysis in document analysis, the recent literature relevant to this field is sorted out, and the two key tasks, table detection and table structure recognition, are studied. For table detection, methods are divided into those based on object detection, graph neural network, generative adversarial network and deformable convolutional network. For table structure recognition, methods include those based on object detection, graph neural network, recurrent neural network, deformable convolutional and dilated convolutional network. The methods and limitations of various models are summarized, and the related tasks and their corresponding datasets are sorted out. The common open-source datasets in table analysis are summarized more widely, and the source, scale, scope of application and file type of each dataset are introduced in detail. The commonly used evaluation metrics in table analysis are listed, and the experimental results of existing models are compared in respect of different experimental datasets. The current development of table analysis is summarized, and the future tendency is discussed.
    Reference | Related Articles | Metrics
    Research Advances on Graph Neural Network Recommendation of Knowledge Graph Enhancement
    WU Guodong, WANG Xueni, LIU Yuliang
    Computer Engineering and Applications    2023, 59 (4): 18-29.   DOI: 10.3778/j.issn.1002-8331.2205-0268
    Abstract208)      PDF(pc) (638KB)(155)       Save
    The existing recommendation methods are mainly based on users’ historical interaction behavior, and the feature information related to users and items is not fully utilized, so the recommendation effect is not ideal. Graph neural network (GNN) recommendation enhanced by knowledge graph (KG) starts from the interaction graph constructed from user-item interaction behavior, introduces a knowledge graph with the same graph structure and processes it with graph neural network techniques, so as to realize personalized recommendation. In this paper, the research progress of knowledge-graph-enhanced graph neural network recommendation is discussed. Firstly, building on a discussion of graph neural network recommendation and knowledge graph recommendation, the relevant research results on knowledge-graph-enhanced GNN recommendation are analyzed in depth from the aspects of item knowledge graph and collaborative knowledge graph. Then, the main problems in existing research on knowledge-graph-enhanced GNN recommendation are pointed out, including large-scale dynamic knowledge graph processing, user preference mining for item attributes, and knowledge graph embedding learning. Finally, the main future research directions of GNN recommendation enhanced by knowledge graph are predicted from the following aspects: GNN recommendation enhanced by knowledge graph in dynamic sequential sequence, GNN recommendation enhanced by knowledge graph in meta-learning, GNN recommendation enhanced by multi-model knowledge graph, GNN cross-domain recommendation enhanced by knowledge graph and so on.
    Reference | Related Articles | Metrics
    Recent Advances on Panoramic Image Quality Assessment Methods
    AI Da, BAI Yansong, YU Kexin, YUAN Hui, LIU Ying
    Computer Engineering and Applications    2022, 58 (24): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0157
    Abstract179)      PDF(pc) (952KB)(152)       Save
    With the rapid development of virtual reality (VR) technology, panoramic videos and images have become a new form of media display. Therefore, panoramic image quality assessment (PIQA) has great practical significance for the improvement of VR technology. Research on PIQA methods over the past five years is reviewed. The existing objective measures for panoramic images are analyzed, including improved methods based on peak signal-to-noise ratio and improved methods based on structural similarity. Deep learning methods for panoramic image quality assessment are summarized. Quality assessment methods are studied for two distortions unique to panoramic images, the stitching distortion and the projection distortion. The commonly used public datasets and some self-built datasets for PIQA are summarized. Using the Pearson linear correlation coefficient, the Spearman rank order correlation coefficient and the root mean squared error as performance measures, the degree of correlation between the objective assessments of PIQA methods and subjective human perception is compared. Future research trends in PIQA methods are outlined.
    Reference | Related Articles | Metrics
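The three agreement measures named in the PIQA abstract above can be sketched in a few lines. This is a minimal, standard-library illustration, not the evaluation code of the reviewed paper: the function names are hypothetical, ties in the Spearman ranks are handled by averaging, and the nonlinear score mapping that quality assessment studies often fit before computing the linear correlation and the error is omitted.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(x, y):
    """Root mean squared error between predicted and subjective scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

Here `x` would hold the objective scores produced by a PIQA method and `y` the subjective scores for the same images; higher PLCC and SROCC and lower RMSE indicate closer agreement with human perception.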
    Research Status of AGV and Machine Integrated Scheduling
    WU Bin, DING Yuchao, ABLA Basri
    Computer Engineering and Applications    2023, 59 (6): 1-12.   DOI: 10.3778/j.issn.1002-8331.2207-0427
    Abstract160)      PDF(pc) (727KB)(151)       Save
    With the wide application of automated guided vehicles (AGV), the cooperation between machines and AGVs in flexible manufacturing systems (FMS) is receiving more and more attention. Research on AGV and machine integrated scheduling mainly covers machine allocation, process sequencing, allocation of transport tasks to AGVs and AGV path planning. This is a very complex combinatorial optimization problem whose study has important academic significance and application value. Based on the characteristics of the problem, the latest domestic and international literature is reviewed from the two aspects of model and algorithm. The constraints and optimization objectives of the existing models are classified in detail, and the representative results of the existing algorithms are summarized from five aspects, including genetic algorithm, hybrid optimization algorithm and simulation optimization algorithm. On this basis, the shortcomings of existing research are pointed out, and the content and direction of future research are put forward.
    Reference | Related Articles | Metrics
    Review on Application of Deep Learning in Helmet Wearing Detection
    GAO Teng, ZHANG Xianwu, LI Bai
    Computer Engineering and Applications    2023, 59 (6): 13-29.   DOI: 10.3778/j.issn.1002-8331.2207-0434
    Abstract199)      PDF(pc) (832KB)(151)       Save
    Driven by deep learning, many approaches to object detection have made great progress in the field of industrial security, and the study of helmet-wearing detection has gradually become a significant topic in intelligent image recognition. In order to comprehensively analyze the research status of deep learning in the helmet-wearing detection task, and to facilitate follow-up research, this paper analyzes the state-of-the-art deep learning-based helmet-wearing detection algorithms proposed by domestic and foreign scholars in recent years and compares their advantages and limitations. This paper is structured in three sections: the establishment and usage of databases, the predominant algorithms for helmet-wearing detection, and the current challenges in the field of helmet-wearing detection. Future research directions and research focuses in the field of helmet-wearing detection are also put forward.
    Reference | Related Articles | Metrics
    Survey of Transformer-Based Object Detection Algorithms
    LI Jian, DU Jianqiang, ZHU Yanchen, GUO Yongkun
    Computer Engineering and Applications    2023, 59 (10): 48-64.   DOI: 10.3778/j.issn.1002-8331.2211-0133
    Abstract233)      PDF(pc) (875KB)(151)       Save
    Transformer is a kind of deep learning framework with strong modeling and parallel computing capabilities. At present, Transformer-based object detection has become a research hotspot. In order to further explore new ideas and directions, this paper summarizes the existing Transformer-based object detection algorithms as well as a variety of object detection datasets and their application scenarios. This paper describes the relevant algorithms for Transformer-based object detection from four aspects, i.e., feature extraction, object estimation, label matching policy and application of algorithm, compares the Transformer algorithms with object detection algorithms based on convolutional neural networks, analyzes the advantages and disadvantages of Transformer in the object detection task, and proposes a general framework for Transformer-based object detection models. Finally, the development trend of Transformer in the field of object detection is discussed.
    Reference | Related Articles | Metrics
    Multi-Modal Meteorological Forecasting Based on Transformer
    XIANG Deping, ZHANG Pu, XIANG Shiming, PAN Chunhong
    Computer Engineering and Applications    2023, 59 (10): 94-103.   DOI: 10.3778/j.issn.1002-8331.2208-0486
    Abstract153)      PDF(pc) (977KB)(148)       Save
    Thanks to the rapid development of meteorological observation technology, the meteorological industry has accumulated massive meteorological data, which provides an opportunity to build new data-driven meteorological forecasting methods. Due to the long-term dependence and large-scale spatial correlation hidden in meteorological data, and due to the complex coupling relationship between different modalities, meteorological forecasting with deep learning is still a challenging research topic. This paper presents a deep learning model for meteorological forecasting based on multi-modal fusion, using sequential multi-modal data at the same atmospheric pressure levels, composed of four classical meteorological elements: temperature, relative humidity, U-component of wind and V-component of wind. Specifically, a convolutional network is used to learn features from each modality, and a gating mechanism is then applied to these features for multi-modal weighted fusion. Secondly, the traditional attention mechanism is replaced with parallel spatial-temporal axial attention in order to effectively learn long-term dependencies and large-scale spatial associations. Architecturally, the Transformer encoder-decoder structure is employed as the overall framework. Extensive comparative experiments have been conducted on the regional ERA5 reanalysis dataset, demonstrating that the proposed method is effective and superior in the prediction of temperature, relative humidity and wind.
    Reference | Related Articles | Metrics
    Improved YOLOv5 Lightweight Mask Detection Algorithm
    LIU Chonghao, PAN Lihu, YANG Fan, ZHANG Rui
    Computer Engineering and Applications    2023, 59 (7): 232-241.   DOI: 10.3778/j.issn.1002-8331.2209-0013
    Abstract153)      PDF(pc) (906KB)(146)       Save
    In order to improve the detection efficiency of existing mask detection algorithms, and reduce the parameters and model size, an improved lightweight mask detection algorithm, YOLOv5-MBF, is proposed. Firstly, the GELU activation function replaces the hard-swish activation function of the MobileNetV3 network, which improves the convergence of the model, and the improved MobileNetV3 network replaces the YOLOv5s backbone network, which reduces the amount of computation and improves detection speed. Secondly, the BiFPN feature pyramid structure is added to fuse different feature layers, which improves the detection accuracy. At the same time, Mosaic and Mixup data augmentation are used in data processing to improve the generalization and robustness of the model. Focal-Loss EIoU is used as the regression loss function, which speeds up the convergence of model training and improves the localization accuracy of mask and face bounding boxes. Finally, the CBAM attention mechanism is added to make the model pay more attention to important features, suppress insignificant features and improve the detection performance. The experimental results show that the average accuracy of the algorithm is 89.5% on the mask-wearing target and the mask-not-wearing target, the model inference speed is increased by 43%, the model parameters are reduced by 49%, and the model size is reduced by 48%, which meets the real-time and detection accuracy requirements of mask detection tasks.
    Reference | Related Articles | Metrics
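The activation swap described in the YOLOv5-MBF abstract above involves two functions with well-known closed forms. The sketch below uses the usual textbook definitions, hard-swish as x * ReLU6(x + 3) / 6 and GELU as x times the standard normal CDF; it is only meant to show how the two curves differ, and it is not taken from the paper's implementation.

```python
import math


def hard_swish(x):
    """Hard-swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Print both activations side by side on a few sample inputs.
    for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={value:+.1f}  hard_swish={hard_swish(value):+.4f}  gelu={gelu(value):+.4f}")
```

Hard-swish is exactly zero for inputs at or below -3 and exactly the identity at or above 3, whereas GELU is smooth everywhere and only approaches those limits.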
    FS-YOLOv5: Lightweight Infrared Road Target Detection Method
    HUANG Lei, YANG Yuan, YANG Chengyu, YANG Wei, LI Yaohua
    Computer Engineering and Applications    2023, 59 (9): 215-224.   DOI: 10.3778/j.issn.1002-8331.2210-0487
    Abstract169)      PDF(pc) (815KB)(140)       Save
    In order to solve the problems of traditional target recognition algorithms in complex scenes, including low precision, poor real-time performance and difficulty in small target detection, a lightweight FS-YOLOv5s model for infrared scenes is proposed. Based on YOLOv5s, a one-stage target detection network, a new FS-MobileNetV3 network is proposed to extract feature maps in place of the CSPDarknet backbone network. A power transform is introduced into the CIoU loss function of the original network, which is thereby replaced by α-CIoU, to improve the network's ability to detect small targets. Then the K-means++ clustering algorithm is applied to the FLIR infrared dataset to regenerate the anchors. DIoU-NMS replaces the NMS post-processing method of the original network to improve the detection of occluded objects and reduce the missed detection rate of the model. Ablation experiments on the FLIR infrared dataset verify that the FS-YOLOv5s lightweight algorithm can handle road target detection in infrared scenes. Compared with the original network, the average accuracy of the FS-YOLOv5s model is reduced by only 0.37 percentage points, while the size is reduced by 26%, the number of parameters is reduced by 29%, and the detection speed is increased by 11 FPS, which meets the needs of mobile deployment in different scenarios.
    Reference | Related Articles | Metrics
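The DIoU-NMS post-processing step mentioned in the FS-YOLOv5s abstract above can be illustrated as follows. This is a minimal sketch of the general technique, assuming boxes in (x1, y1, x2, y2) corner format and a hypothetical threshold of 0.5; it is not the authors' code. DIoU subtracts from the IoU the squared distance between box centers divided by the squared diagonal of the smallest enclosing box, so two boxes that overlap but have distant centers are less likely to suppress each other.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    """Distance-IoU: IoU minus normalized squared center distance."""
    center_dist_sq = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
        + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both a and b.
    enclose_w = max(a[2], b[2]) - min(a[0], b[0])
    enclose_h = max(a[3], b[3]) - min(a[1], b[1])
    diag_sq = enclose_w ** 2 + enclose_h ** 2
    penalty = center_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return iou(a, b) - penalty


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

For example, with boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]` and scores `[0.9, 0.8, 0.7]`, the second box is suppressed by the first and the returned indices are `[0, 2]`.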
    Experimental Research on Image Recognition of Wire Rope Damage Based on Improved YOLOv5
    WANG Hongyao, HAN Shuang, LI Qinyi
    Computer Engineering and Applications    2023, 59 (17): 99-106.   DOI: 10.3778/j.issn.1002-8331.2210-0505
    Abstract86)      PDF(pc) (3673KB)(140)       Save
    Wire rope plays a very important role in coal mine equipment. In order to find wire rope damage as early as possible, give early warning and handle faults in advance, and protect the safety of personnel in the mine, a method of wire rope damage identification and detection based on deep learning is proposed. The target detection algorithm YOLOv5 is adopted and improved. A fast adaptive weighted median filter is used for image pre-processing to improve the recognition accuracy of wire rope damage images. After the improvement, the processing speed reaches 187 ms per image, and the enhancement effect is good. CBAM and Transformer prediction heads (TPH) are integrated into YOLOv5, and the expanded dataset is fed into the improved model for training and testing. The experimental results show that the improved model has good detection performance: the final average accuracy reaches 0.893, which is 0.037 higher than the original algorithm and 0.196, 0.162 and 0.102 higher than the traditional detection algorithms SSD, Faster R-CNN and the original YOLOv3, respectively. This shows that the proposed algorithm has high accuracy and effectively improves the recognition accuracy of wire rope damage images.
    Reference | Related Articles | Metrics
    Review of Research on Driver Fatigue Driving Detection Methods
    ZHANG Rui, ZHU Tianjun, ZOU Zhiliang, SONG Rui
    Computer Engineering and Applications    2022, 58 (21): 53-66.   DOI: 10.3778/j.issn.1002-8331.2204-0053
    Abstract205)      PDF(pc) (946KB)(139)       Save
    The proportion of traffic accidents caused by fatigue driving has increased year by year, which has attracted widespread attention from researchers. At present, research on fatigue driving detection is limited by factors such as the level of science and technology, the environment and road conditions, which makes it difficult to develop fatigue driving detection technology further. This article introduces the progress in driver fatigue driving detection methods over the past decade. The two categories of active and passive detection methods are elaborated and reviewed, and each is further classified according to its characteristics. The advantages and limitations of the various fatigue driving detection methods are further analyzed, and the detection algorithms used in active detection methods based on facial features in the past three years are analyzed and summarized. Finally, the shortcomings of the various fatigue driving detection methods are summarized, and future research trends in the field of fatigue detection are proposed, providing new ideas for further research.
    Reference | Related Articles | Metrics
    Review of Research on Application of Vision Transformer in Medical Image Analysis
    SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing
    Computer Engineering and Applications    2023, 59 (8): 41-55.   DOI: 10.3778/j.issn.1002-8331.2206-0022
    Abstract208)      PDF(pc) (869KB)(138)       Save
    The deep self-attention network (Transformer) has a natural ability to model global features and long-range correlations of input information, which is strongly complementary to the inductive bias of convolutional neural networks (CNN). Inspired by its great success in natural language processing, Transformer has been widely introduced into various computer vision tasks, especially medical image analysis, and has achieved remarkable performance. This paper first introduces the typical work of vision Transformer on natural images, and then organizes and summarizes the related work according to different lesions or organs in the subfields of medical image segmentation, medical image classification and medical image registration, focusing on the implementation ideas of some representative work. Finally, current research is discussed and future directions are pointed out. The purpose of this paper is to provide a reference for further in-depth research in this field.
    Reference | Related Articles | Metrics
    Review of Recommendation Systems Using Knowledge Graph
    ZHANG Mingxing, ZHANG Xiaoxiong, LIU Shanshan, TIAN Hao, YANG Qinqin
    Computer Engineering and Applications    2023, 59 (4): 30-42.   DOI: 10.3778/j.issn.1002-8331.2209-0033
    Abstract240)      PDF(pc) (702KB)(135)       Save
    With the rapid development of the Internet, obtaining the needed information from huge amounts of data has become increasingly important. The recommendation system is a method of screening information, which aims to recommend personalized content for users. However, traditional recommendation algorithms still suffer from several challenges, such as data sparsity and cold start. In recent years, researchers have used the rich entity and relationship information in the knowledge graph to alleviate these problems and enhance the overall performance of the recommendation system. This paper reviews recommendation systems based on knowledge graph from three aspects. Firstly, basic concepts of the recommendation system and knowledge graph are introduced, and the shortcomings of the existing recommendation algorithms are pointed out. Then, the research on recommendation systems based on knowledge graph is analyzed in detail, and the advantages and challenges of the different approaches are assessed. Finally, relevant application scenarios and future development prospects are summarized.
    Reference | Related Articles | Metrics
    Survey of Few-Shot Relation Classification
    LIU Tao, KE Zunwang, Wushour·Silamu
    Computer Engineering and Applications    2023, 59 (9): 1-12.   DOI: 10.3778/j.issn.1002-8331.2208-0027
    Abstract149)      PDF(pc) (687KB)(134)       Save
    Few-shot relation classification aims to mine the semantic relationship between target entities in natural language texts with limited labeled training examples, so as to deal with the resource shortage problem faced by traditional relation classification methods and to make relation classification better applicable to data-scarce fields such as medicine, finance and ethnic language processing. At present, the research work on few-shot relation classification learns prior knowledge under the training strategy of meta-learning in order to adapt quickly to new tasks. Generally, the methods can be divided into four classes: prototype network based, pre-trained language model based, parameter optimization based, and graph neural network based. This paper reviews the development of few-shot relation classification, and analyzes and summarizes the advantages and limitations of the different research methods. On this basis, the paper analyzes the current problems and challenges faced by few-shot relation classification, and outlines future research directions in this field.
    Reference | Related Articles | Metrics
    Review of Single-Image 3D Face Reconstruction Methods
    WANG Jingting, LI Huibin
    Computer Engineering and Applications    2023, 59 (17): 1-21.   DOI: 10.3778/j.issn.1002-8331.2210-0041
    Abstract96)      PDF(pc) (961KB)(133)       Save
    In recent years, 3D face reconstruction task, as an important part of “digital human” technology, has received great attention from both academia and industry. In particular, 3D face reconstruction task based on a single image has made great progress by fully combining traditional camera model, illumination model, 3D face statistical deformation model with the deep convolutional neural network and deep generative models. This paper focuses on the single-image 3D face reconstruction problem, and divides the existing research works into two categories based on implicit space coding and explicit space regression. The first type of research works optimize the basis coefficient solution and loss function design of the basic 3D face statistical model to improve the reconstruction effect, which has the advantage of robustness in face topology change but lacks detailed features. The second type of research works represent 3D faces in the forms of multiple data in explicit space and regress them directly by deep networks, which can usually obtain more personalized 3D face detail features and have better robustness to interference factors such as illumination and occlusion. Furthermore, based on the commonly used datasets and evaluation metrics, this paper fully explores and compares the advantages and disadvantages of some typical methods of both categories. Finally, it summarizes the whole paper and points out the main challenges and future development trends of the single-image based 3D face reconstruction task.
    Reference | Related Articles | Metrics
    LSTFormer: Lightweight Semantic Segmentation Network Based on Swin Transformer
    YANG Cheng, GAO Jianlin, ZHENG Meilin, DING Rong
    Computer Engineering and Applications    2023, 59 (12): 166-175.   DOI: 10.3778/j.issn.1002-8331.2210-0331
    Abstract175)      PDF(pc) (801KB)(131)       Save
    To address the common problem of high computational complexity in existing Transformer-based semantic segmentation networks, a lightweight semantic segmentation network based on Swin Transformer is proposed. Firstly, feature maps at multiple scales are obtained by Swin Transformer. Secondly, the full perception module and the improved cascading fusion module are used to fuse the feature maps of different scales across layers, reducing the semantic gap between feature maps of different levels. Then, a single Swin Transformer block is introduced to refine the initial segmentation feature map and improve the ability of the network to classify different pixels through the shifted-window self-attention mechanism. Finally, the Dice loss function and the cross-entropy loss function are used together in the training stage to improve the segmentation performance and convergence speed of the network. The experimental results show that the mIoU of LSTFormer on ADE20K and Cityscapes reaches 49.47% and 81.47%, respectively. Compared with similar networks such as SETR and Swin-UPerNet, LSTFormer has fewer parameters and less computation while maintaining the same segmentation accuracy.
    Reference | Related Articles | Metrics
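The combination of Dice loss and cross-entropy loss mentioned in the LSTFormer abstract above can be sketched as below. This is a simplified illustration, not the paper's training code: it assumes a single foreground class with per-pixel probabilities in a flat list, and the equal weighting of the two terms is a hypothetical default, since the abstract does not state how the losses are weighted.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|), smoothed by eps."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, targets, dice_weight=1.0, ce_weight=1.0):
    """Weighted sum of Dice and cross-entropy (weights are assumed, not from the paper)."""
    return dice_weight * dice_loss(probs, targets) \
        + ce_weight * cross_entropy_loss(probs, targets)
```

Dice loss measures region overlap and is less sensitive to class imbalance, while cross-entropy gives a per-pixel gradient signal; summing them is a common way to get both properties.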
    Computer Engineering and Applications    2023, 59 (14): 0-0.  
    Abstract98)      PDF(pc) (702KB)(130)       Save
    Related Articles | Metrics