Most Read articles

    Published in last 1 year
    Overview of Multi-Agent Path Finding
    LIU Zhifei, CAO Lei, LAI Jun, CHEN Xiliang, CHEN Ying
    Computer Engineering and Applications    2022, 58 (20): 43-64.   DOI: 10.3778/j.issn.1002-8331.2203-0467
    Abstract (915)      PDF(pc) (1013KB) (317)
    The multi-agent path finding (MAPF) problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents must be able to follow these paths concurrently without colliding with each other. MAPF is widely used in logistics, military, security and other fields. The main research results on MAPF at home and abroad are systematically sorted and classified by planning method into centralized planning algorithms and distributed execution algorithms. Centralized planning algorithms are the most classical and the most commonly used MAPF algorithms; they fall into four groups, based on A* search, conflict search, cost growth tree and protocol. Distributed execution algorithms are based on reinforcement learning and, according to the improvement technique used, can be divided into three types: expert demonstration, improved communication and task decomposition. By comparing the characteristics and applicability of MAPF algorithms and analyzing their advantages and disadvantages, the challenges of existing algorithms are pointed out and future work is forecast.
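The collision-free constraint described in this abstract can be made concrete with a small check. The sketch below is not taken from the surveyed paper; it assumes a grid world with discrete time steps, paths given as lists of (row, col) cells, and agents that wait at their last cell once their path ends. The names `position_at` and `find_conflicts` are hypothetical.

```python
from itertools import combinations


def position_at(path, t):
    """Cell occupied at time t; the agent waits at its last cell afterwards (assumption)."""
    return path[min(t, len(path) - 1)]


def find_conflicts(paths):
    """Return (kind, agent_i, agent_j, t) tuples for every pairwise conflict.

    A 'vertex' conflict means two agents occupy the same cell at time t.
    An 'edge' conflict means two agents swap cells between t and t + 1.
    """
    conflicts = []
    horizon = max(len(p) for p in paths)
    for i, j in combinations(range(len(paths)), 2):
        for t in range(horizon):
            a_now = position_at(paths[i], t)
            b_now = position_at(paths[j], t)
            if a_now == b_now:
                conflicts.append(("vertex", i, j, t))
            if t + 1 < horizon:
                a_next = position_at(paths[i], t + 1)
                b_next = position_at(paths[j], t + 1)
                if a_now != b_now and a_now == b_next and b_now == a_next:
                    conflicts.append(("edge", i, j, t))
    return conflicts


# Two agents moving in parallel rows never meet.
parallel = [[(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)]]
# Two agents trying to swap adjacent cells collide on the shared edge.
swap = [[(0, 0), (0, 1)], [(0, 1), (0, 0)]]

print(find_conflicts(parallel))  # -> []
print(find_conflicts(swap))      # -> [('edge', 0, 1, 0)]
```

A set of paths is a valid MAPF solution under these assumptions only when `find_conflicts` returns an empty list; real planners differ in how they search for such a set, not in this validity test.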
    Review of Visual Odometry Methods Based on Deep Learning
    ZHI Henghui, YIN Chenyang, LI Huibin
    Computer Engineering and Applications    2022, 58 (20): 1-15.   DOI: 10.3778/j.issn.1002-8331.2203-0480
    Abstract (653)      PDF(pc) (904KB) (370)
    Visual odometry (VO) is a common method for positioning mobile devices equipped with vision sensors, and has been widely used in autonomous driving, mobile robots, AR/VR and other fields. Compared with traditional model-based methods, deep learning-based methods can learn efficient and robust feature representations from data without explicit computation, thereby improving their ability to handle challenging scenes such as illumination changes and low texture. This paper first briefly reviews model-based visual odometry methods, and then covers six aspects of deep learning-based visual odometry: supervised learning methods, unsupervised learning methods, model-learning fusion methods, common datasets, evaluation metrics, and a comparison of model-based and deep learning methods. Finally, existing problems and future development trends of deep learning-based visual odometry are discussed.
    Overview of Cross-Modal Retrieval Technology
    XU Wenwan, ZHOU Xiaoping, WANG Jia
    Computer Engineering and Applications    2022, 58 (23): 12-23.   DOI: 10.3778/j.issn.1002-8331.2205-0160
    Abstract (464)      PDF(pc) (769KB) (171)
    Cross-modal retrieval uses data of one modality to retrieve information in other modalities, and has become a research hotspot in the era of big data. Researchers rely on real-valued representation and binary representation to narrow the semantic gap between modalities and to compare similarity effectively, but low retrieval efficiency and information loss remain problems. How to further improve retrieval efficiency and information utilization is currently a key challenge for cross-modal retrieval research. Firstly, the development status of real-valued representation and binary representation in cross-modal retrieval is introduced. Secondly, five cross-modal retrieval methods under the two representation technologies are analyzed and compared in terms of modeling technology and similarity comparison: subspace learning, topic statistical model learning, deep learning, traditional hashing and deep hashing. Then, the latest multi-modal datasets are summarized to provide a reference for researchers and engineers. Finally, the challenges of cross-modal retrieval are analyzed and future research directions in this field are pointed out.
    Survey on Image Semantic Segmentation in Dilemma of Few-Shot
    WEI Ting, LI Xinlei, LIU Hui
    Computer Engineering and Applications    2023, 59 (2): 1-11.   DOI: 10.3778/j.issn.1002-8331.2205-0496
    Abstract (461)      PDF(pc) (4301KB) (336)
    In recent years, image semantic segmentation has developed rapidly thanks to the emergence of large-scale datasets. In practical applications, however, large-scale, high-quality images are not easy to obtain, and image annotation costs a great deal of manpower and time. To reduce the dependence on the number of samples, few-shot semantic segmentation has gradually become a research hotspot. Current few-shot semantic segmentation methods mainly use the idea of meta-learning and, according to model structure, can be divided into three categories: those based on the Siamese neural network, on the prototype network, and on the attention mechanism. Based on current research, this paper introduces the development, advantages and disadvantages of the various few-shot semantic segmentation methods, as well as common datasets and experimental designs. On this basis, application scenarios and future development directions are summarized.
    Research Progress in Application of Graph Anomaly Detection in Financial Anti-Fraud
    LIU Hualing, LIU Yaxin, XU Junyi, CHEN Shanghui, QIAO Liang
    Computer Engineering and Applications    2022, 58 (22): 41-53.   DOI: 10.3778/j.issn.1002-8331.2203-0233
    Abstract (456)      PDF(pc) (1848KB) (332)
    With the rapid development of digital finance, fraud presents new characteristics such as intellectualization, industrialization and strong concealment, and the limitations of traditional expert rules and machine learning methods are increasingly apparent. Graph anomaly detection technology has a strong ability to deal with associated information, which provides a new idea for financial anti-fraud. Firstly, the development and advantages of graph anomaly detection are briefly introduced. Secondly, from the perspectives of individual anti-fraud and group anti-fraud, graph anomaly detection technology is divided into individual fraud detection based on features, proximity, graph representation learning or community division, and gang fraud detection based on dense subgraphs, dense subtensors or deep network structure. The basic idea, advantages, disadvantages, research progress and typical applications of each anomaly detection technology are analyzed and compared. Finally, the common test datasets and evaluation criteria are summarized, and the development prospects and research directions of graph anomaly detection technology in financial anti-fraud are given.
    Overview of Smoke and Fire Detection Algorithms Based on Deep Learning
    ZHU Yuhua, SI Yiyi, LI Zhihui
    Computer Engineering and Applications    2022, 58 (23): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0154
    Abstract (326)      PDF(pc) (782KB) (253)
    Among various disasters, fire is one of those that most frequently and widely threaten public safety and social development. With the rapid development of economic construction and the growing size of cities, the number of major fire hazards has increased dramatically. However, the widely used smoke-sensor method of fire detection is vulnerable to factors such as distance, resulting in untimely detection. The introduction of video surveillance systems has provided new ideas for solving this problem. Traditional video-based image processing algorithms were proposed earlier, and the recent rapid development of machine vision and image processing has produced a series of methods that use deep learning to automatically detect fires in video and images, which have important practical applications in fire safety. To comprehensively analyze the improvements and applications of deep learning methods for fire detection, this paper first briefly introduces the deep learning-based fire detection process, then gives a detailed comparative analysis of deep methods for fire detection at three granularities (classification, detection, and segmentation), and elaborates on the improvements each class of algorithms makes to address existing problems. Finally, the current problems of fire detection are summarized and future research directions are proposed.
    Survey of Transformer Research in Computer Vision
    LI Xiang, ZHANG Tao, ZHANG Zhe, WEI Hongyang, QIAN Yurong
    Computer Engineering and Applications    2023, 59 (1): 1-14.   DOI: 10.3778/j.issn.1002-8331.2204-0207
    Abstract (318)      PDF(pc) (1285KB) (234)
    Transformer is a deep neural network based on the self-attention mechanism. In recent years, Transformer-based models have become a hot research direction in computer vision, and their structures are constantly being improved and extended, for example with local attention mechanisms and pyramid structures. Improved vision models based on the Transformer structure are reviewed and summarized in terms of performance optimization and structural improvement. In addition, the respective advantages and disadvantages of the Transformer and the convolutional neural network (CNN) are compared and analyzed, and a new hybrid structure of CNN+Transformer is introduced. Finally, the development of Transformer in computer vision is summarized and its prospects are discussed.
    Object Detection Algorithms Based on Deep Learning and Transformer
    FU Miaomiao, DENG Miaolei, ZHANG Dexian
    Computer Engineering and Applications    2023, 59 (1): 37-48.   DOI: 10.3778/j.issn.1002-8331.2205-0354
    Abstract (304)      PDF(pc) (947KB) (183)
    Object detection is the basis for advanced vision tasks such as object tracking and instance segmentation, and has important applications in real-world scenarios such as intelligent transportation, defect detection, and intelligent security. Existing high-precision detection algorithms are all built on deep learning, together with anchor box technology. However, the shortcomings of anchor boxes themselves have a great impact on detector performance, and anchor-free detection has become a new research direction in the field in recent years. At the same time, the great potential shown by Transformer has opened up a new direction for the vision field, that of combining images with Transformer, and Transformer-based object detection has also become a new research hotspot. This paper systematically summarizes object detection algorithms in the deep learning era, surveys papers on object detection from the past five years, analyzes these algorithms in depth from the anchor-free and Transformer perspectives, and introduces their applications in real scenarios and the datasets commonly used in the field. Finally, based on the current research status, future research directions of object detection are discussed.
    Review of Cross-Domain Object Detection Algorithms Based on Depth Domain Adaptation
    LIU Hualing, PI Changpeng, ZHAO Chenyu, QIAO Liang
    Computer Engineering and Applications    2023, 59 (8): 1-12.   DOI: 10.3778/j.issn.1002-8331.2210-0063
    Abstract (302)      PDF(pc) (583KB) (224)
    In recent years, object detection algorithms based on deep learning have attracted wide attention for their high detection performance, and have been successfully applied in many fields such as automatic driving and human-computer interaction. However, traditional deep learning methods assume that the training set (source domain) and the test set (target domain) follow the same distribution; this assumption rarely holds in practice, which severely reduces the generalization performance of the model. How to align the distributions of the source and target domains so as to improve the generalization of object detection models has become a research hotspot in the past two years. This article reviews cross-domain object detection algorithms. First, it introduces the preliminaries of cross-domain object detection: depth domain adaptation and object detection. Cross-domain object detection is decomposed into these two sub-areas for an overview, in order to understand its development from the underlying logic. Next, the article introduces the latest developments in cross-domain object detection algorithms in five categories (discrepancy-based, adversarial, reconstruction-based, hybrid and others) and sorts out the research context of each category. Finally, the article summarizes the field and looks ahead to the development trend of cross-domain object detection algorithms.
    Survey of Camera Pose Estimation Methods Based on Deep Learning
    WANG Jing, JIN Yuchu, GUO Ping, HU Shaoyi
    Computer Engineering and Applications    2023, 59 (7): 1-14.   DOI: 10.3778/j.issn.1002-8331.2209-0280
    Abstract (294)      PDF(pc) (702KB) (196)
    Camera pose estimation is a technology for accurately estimating the 6-DOF position and orientation of a camera in the world coordinate system in a known environment. It is a key technology in robotics and automatic driving. With the rapid development of deep learning, using deep learning to optimize camera pose estimation algorithms has become a current research hotspot. To capture the current research status and trends of camera pose estimation algorithms, the mainstream deep learning-based algorithms are summarized. Firstly, traditional camera pose estimation methods based on feature points are briefly introduced. Then, camera pose estimation methods based on deep learning are introduced in detail. According to their core algorithms, end-to-end camera pose estimation, scene coordinate regression, retrieval-based camera pose estimation, hierarchical structure, multi-information fusion and cross-scene camera pose estimation are elaborated and analyzed. Finally, this paper summarizes the current research status, points out the challenges in the field based on an in-depth performance analysis, and discusses the development trend of camera pose estimation.
    Review of Deep Reinforcement Learning Model Research on Vehicle Routing Problems
    YANG Xiaoxiao, KE Lin, CHEN Zhibin
    Computer Engineering and Applications    2023, 59 (5): 1-13.   DOI: 10.3778/j.issn.1002-8331.2210-0153
    Abstract (276)      PDF(pc) (1036KB) (226)
    The vehicle routing problem (VRP) is a classic NP-hard problem that is widely applied in transportation, logistics and other fields. As problem scale and dynamic factors increase, traditional methods for solving the VRP are challenged in computational speed and intelligence. In recent years, with the rapid development of artificial intelligence, and in particular the successful application of reinforcement learning in AlphaGo, a new idea for solving routing problems has emerged. In view of this, this paper summarizes the recent literature that uses deep reinforcement learning (DRL) to solve the VRP and its variants. Firstly, it reviews the relevant principles of solving the VRP with DRL and sorts out the key steps of DRL-based VRP solving. Then it systematically classifies and summarizes four types of solution methods: pointer networks, graph neural networks, Transformer and hybrid models. It also compares and analyzes the performance of current DRL-based models on the VRP and its variants. Finally, this paper sums up the challenges of DRL-based VRP solving and future research directions.
    Review of Recommendation Systems Using Knowledge Graph
    ZHANG Mingxing, ZHANG Xiaoxiong, LIU Shanshan, TIAN Hao, YANG Qinqin
    Computer Engineering and Applications    2023, 59 (4): 30-42.   DOI: 10.3778/j.issn.1002-8331.2209-0033
    Abstract (240)      PDF(pc) (702KB) (135)
    With the rapid development of the Internet, obtaining the needed information from huge amounts of data has become increasingly important. A recommendation system is a method of screening information that aims to recommend personalized content to users. However, traditional recommendation algorithms still suffer from several challenges, such as data sparsity and cold start. In recent years, researchers have used the rich entity and relationship information in knowledge graphs to alleviate these problems and enhance the overall performance of recommendation systems. This paper reviews knowledge graph-based recommendation systems from three aspects. Firstly, basic concepts of the recommendation system and the knowledge graph are introduced, and the shortcomings of existing recommendation algorithms are pointed out. Then, research on knowledge graph-based recommendation systems is analyzed in detail, and the advantages and challenges of the different approaches are assessed. Finally, relevant application scenarios and future development prospects are summarized.
    Review of Object Detection Algorithm Improvement in Deep Learning
    YANG Feng, DING Zhitong, XING Mengmeng, DING Bo
    Computer Engineering and Applications    2023, 59 (11): 1-15.   DOI: 10.3778/j.issn.1002-8331.2209-0312
    Abstract (237)      PDF(pc) (691KB) (195)
    Object detection is currently a research hotspot in computer vision. With the development of deep learning, object detection algorithms based on deep learning are increasingly applied and their performance keeps improving. This paper summarizes the latest research progress of deep learning-based object detection methods by reviewing the common problems encountered in object detection and the corresponding improvement methods. It focuses on two types of deep learning-based object detection algorithms. In addition, the latest improvement ideas for object detection algorithms are summarized in terms of attention mechanisms, lightweight networks and multi-scale detection. Finally, in view of current problems in the field, future development trends are discussed and feasible solutions are put forward, in order to provide reference ideas and directions for follow-up research.
    Survey on Deep-Learning-Based Long-Term Object Tracking Algorithms
    LIANG Yitao, HAN Yongbo, LI Lei
    Computer Engineering and Applications    2023, 59 (4): 1-17.   DOI: 10.3778/j.issn.1002-8331.2206-0507
    Abstract (237)      PDF(pc) (918KB) (191)
    In the field of visual target tracking, long-term tracking has been paid more and more attention by researchers, because it contains more realistic challenging scenarios, such as occlusion, similar object interference and target disappearance. However, traditional long-term tracking algorithms are inefficient and have been unable to meet the application requirements of tracker performance in fields, such as video surveillance and autonomous driving. Recently, a lot of work has rapidly advanced the development of long-term tracking techniques by introducing deep neural networks. In order to analyze the current situation and future development of deep-learning-based long-term tracking algorithms, firstly, by comparing the long-term and short-term tracking datasets and their evaluation indicators, the requirements and difficulties of long-term tracking tasks are summarized, and the development of long-term tracking datasets and evaluation systems is introduced. Subsequently, based on the design framework of deep-learning-based long-term tracking algorithm, the design ideas of each component of the framework are described in detail. Then, taking the long-term tracking strategy as the starting point, the existing research work is analyzed, and the advantages and disadvantages of different models and their characteristics are summarized. Finally, based on the summary of existing research work, the challenges faced in this field are discussed, and the future research trends are presented.
    Target Detection Algorithm of Remote Sensing Image Based on Improved YOLOv5
    LI Kunya, OU Ou, LIU Guangbin, YU Zefeng, LI Lin
    Computer Engineering and Applications    2023, 59 (9): 207-214.   DOI: 10.3778/j.issn.1002-8331.2209-0119
    Abstract (233)      PDF(pc) (665KB) (179)
    Aiming at the problems of low target detection accuracy caused by high background complexity, multiple target sizes and too many small targets in remote sensing images, this paper proposes a target detection algorithm of remote sensing image based on improved YOLOv5. The channel-global attention mechanism (CGAM) is introduced into the backbone network to enhance the feature extraction ability of targets at different scales and to suppress the interference of redundant information. The dense upsampling convolution (DUC) module is introduced to expand the low resolution convolution feature maps, which can effectively enhance the fusion effect of different convolution feature maps. The improved algorithm is applied to the open remote sensing data set RSOD, and the average accuracy AP value of the improved YOLOv5 algorithm reaches 78.5%, which is 3.1 percentage points higher than that of the original algorithm. Experimental results show that the improved algorithm can effectively improve the accuracy of remote sensing image target detection.
    Survey of Transformer-Based Object Detection Algorithms
    LI Jian, DU Jianqiang, ZHU Yanchen, GUO Yongkun
    Computer Engineering and Applications    2023, 59 (10): 48-64.   DOI: 10.3778/j.issn.1002-8331.2211-0133
    Abstract (233)      PDF(pc) (875KB) (151)
    Transformer is a deep learning framework with strong modeling and parallel computing capabilities. At present, Transformer-based object detection has become a research hotspot. To further explore new ideas and directions, this paper summarizes existing Transformer-based object detection algorithms as well as a variety of object detection datasets and their application scenarios. It describes the relevant algorithms from four aspects (feature extraction, object estimation, label matching policy and application of the algorithm), compares Transformer algorithms with object detection algorithms based on convolutional neural networks, analyzes the advantages and disadvantages of Transformer in object detection tasks, and proposes a general framework for Transformer-based object detection models. Finally, the development trend of Transformer in object detection is discussed.
    Review of Research on Computing-Intensive Task Scheduling in Edge Environments
    LIU Yanpei, ZHU Yunjing, BIN Yanru, CHEN Ningning, WANG Liping
    Computer Engineering and Applications    2022, 58 (20): 28-42.   DOI: 10.3778/j.issn.1002-8331.2202-0243
    Abstract (228)      PDF(pc) (1187KB) (78)
    With the dramatic increase in the number of mobile devices and the widespread use of computing-intensive applications such as face recognition, Internet of Vehicles and virtual reality, a reasonable task scheduling scheme for computing-intensive applications is needed to optimally match tasks with collaborative resources and meet users' QoS requirements; such a scheme can address the problems of long delays, high cost, unbalanced load and low resource utilization in the edge cloud center. Firstly, the scheduling framework, execution process, application scenarios and performance indicators of computing-intensive application tasks in the edge computing environment are described. Secondly, this paper analyzes and compares three task scheduling schemes according to their optimization goals (time and cost, energy consumption and resource utilization, load balancing and throughput) and summarizes the advantages, disadvantages and applicable scenarios of these schemes. Then, by analyzing the SDN-based edge computing architecture in the 5G environment, three strategies are proposed: an SDN-based task scheduling strategy for computing-intensive data packets at the edge, a deep reinforcement learning-based task scheduling strategy for computing-intensive applications, and a multi-objective cross-layer task scheduling strategy for 5G IoV networks. Finally, the challenges of task scheduling in the edge computing environment are summarized from the aspects of fault-tolerant scheduling, dynamic microservice scheduling, crowd-aware scheduling, security and privacy.
    Graph Neural Network and Its Research Progress in Field of Image Processing
    JIANG Yuying, CHEN Xinyu, LI Guangming, WANG Fei, GE Hongyi
    Computer Engineering and Applications    2023, 59 (7): 15-30.   DOI: 10.3778/j.issn.1002-8331.2205-0503
    Abstract (227)      PDF(pc) (659KB) (109)
    Graph neural network (GNN) is a deep learning-based model for processing graph-structured data, which has received much attention from researchers for its good interpretability and powerful nonlinear fitting ability on graph-structured data. With its rise, GNN has been integrated with image processing techniques and has made breakthroughs in image classification, human body analysis and visual question answering. Firstly, image processing techniques and the theory of traditional neural networks are introduced, and the principles, characteristics and shortcomings of five major classes of GNNs are analyzed. Secondly, the applications of GNN in image processing are analyzed at five technical levels, and representative models of each class of methods are listed. Thirdly, the common models described in the paper are compared and summarized in terms of datasets and performance evaluation metrics, and nine common public datasets in image processing are introduced. Finally, areas for improvement of GNN in image processing are analyzed in depth, and the prospects of its application in the field are presented.
    Small Object Detection Algorithm Based on Improved YOLOv5 in UAV Image
    XIE Chunhui, WU Jinming, XU Huaiyu
    Computer Engineering and Applications    2023, 59 (9): 198-206.   DOI: 10.3778/j.issn.1002-8331.2212-0336
    Abstract (224)      PDF(pc) (808KB) (176)
    UAV aerial images are characterized by large scale variation and complex backgrounds, so it is difficult for existing detectors to detect small objects in them. To address false and missed detections, a small object detection model, Drone-YOLO, is proposed. A new detection branch is added to improve detection capability at multiple scales, and the model contains a novel feature pyramid network with multi-level information aggregation, which fuses cross-layer information. A feature fusion module based on a multi-scale channel attention mechanism is then designed to improve the focus on small objects. The classification task of the prediction head is decoupled from the regression task, and the loss function is optimized using Alpha-IoU to improve detection accuracy. Experimental results on the VisDrone dataset show that Drone-YOLO improves AP50 by 4.91 percentage points over YOLOv5, with an inference time of only 16.78 ms. Compared with other mainstream models, it detects small targets better and can effectively complete small target detection in UAV aerial images.
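As a rough illustration of the Alpha-IoU idea mentioned in this abstract, the sketch below computes the basic power form of the loss, 1 - IoU^alpha, for two axis-aligned boxes. This is an assumption-laden example, not the Drone-YOLO implementation: the abstract does not say which Alpha-IoU variant or which alpha value is used, so the (x1, y1, x2, y2) box format, the default `alpha=3.0`, and the function names `iou` and `alpha_iou_loss` are all illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic power-IoU loss: 1 - IoU ** alpha (alpha value is an assumption)."""
    return 1.0 - iou(pred, target) ** alpha


pred = (0.0, 0.0, 2.0, 2.0)
target = (1.0, 0.0, 3.0, 2.0)
print(iou(pred, target))                        # overlap 2, union 6
print(alpha_iou_loss(pred, target, alpha=1.0))  # plain IoU loss
print(alpha_iou_loss(pred, target, alpha=3.0))  # power form
```

With alpha = 1 the expression reduces to the ordinary IoU loss; raising IoU to a power greater than 1 changes how strongly boxes at different overlap levels contribute to the loss.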
    CoT-TransUNet:Lightweight Context Transformer Medical Image Segmentation Network
    YANG He, BAI Zhengyao
    Computer Engineering and Applications    2023, 59 (3): 218-225.   DOI: 10.3778/j.issn.1002-8331.2205-0046
    Abstract (219)      PDF(pc) (645KB) (124)
    To address the overly small receptive field of convolution in previous medical image segmentation networks and the feature loss of Transformer, an end-to-end lightweight context Transformer medical image segmentation network (CoT-TransUNet) is proposed. The network consists of three parts: an encoder, a decoder, and skip connections. For the input image, the encoder uses CoTNet as a feature extractor to generate feature maps, and Transformer blocks encode the feature maps as input sequences. Then, the decoder upsamples the encoded features through a cascaded upsampler, which cascades multiple upsampling blocks, each employing the CARAFE upsampling operator. Finally, feature aggregation of the encoder and decoder at different resolutions is achieved through skip connections. CoT-TransUNet adopts CoTNet, which combines global and local context information, in the feature extraction stage, and the CARAFE operator, which has a larger receptive field, in the upsampling stage. This yields better input feature maps and content-based upsampling while remaining lightweight. Experiments on multi-organ segmentation tasks show that CoT-TransUNet achieves better performance than other networks.
    Research on Lightweight of Improved YOLOv5s Track Obstacle Detection Model
    LI Ang, SUN Shijie, ZHANG Zhaoyang, FENG Mingtao, WU Chengzhong, LI Wang
    Computer Engineering and Applications    2023, 59 (4): 197-207.   DOI: 10.3778/j.issn.1002-8331.2208-0045
    Abstract (215)      PDF(pc) (910KB) (98)
    Aiming at the shortcomings of the traditional train track obstacle detection methods with poor real-time performance and low detection accuracy for small targets, a lightweight obstacle detection model based on improved YOLOv5s detection network is proposed. Firstly, a more lightweight Mixup data enhancement method is introduced to replace the original Mosaic data enhancement method. Secondly, the deep separable convolution GhostConv in the GhostNet network structure is introduced to replace the ordinary convolution layer in the feature extraction network and feature fusion network in the original YOLOv5s model, which reduces the computational overhead of the model. The CA spatial attention mechanism is added to the end of the model feature extraction network, which reduces the loss of important location information in the training process of the algorithm and makes up for the loss of detection accuracy caused by improved GhostNet. Finally, sparse training and channel pruning are performed on the improved model to prune away the channels that have little influence on the detection accuracy, while retaining important feature information to make the model more lightweight. The experimental results show that, compared with the original YOLOv5s algorithm, the model size of the improved model is reduced by 9.7 MB, the detection speed is increased by 14 FPS, and the detection accuracy is improved by 1.0 percentage point on the self-made diversified rail transit dataset. At the same time, compared with the current mainstream detection algorithm, the detection accuracy and detection speed also have some advantages, which is suitable for the obstacle target detection in complex rail transit environment.
    Survey on Emotion Recognition in Conversation
    CHEN Xiaoting, LI Shi
    Computer Engineering and Applications    2023, 59 (3): 33-48.   DOI: 10.3778/j.issn.1002-8331.2207-0417
    Abstract (214)      PDF(pc) (681KB) (103)
    Emotion recognition in conversation (ERC) is a hot research topic in the field of affective computing. It aims to detect the emotion category of each utterance in a dialogue. ERC is significant for dialogue understanding and dialogue generation, and it has wide practical value in fields such as social media analysis, recommendation systems, medical treatment and human-computer interaction. With the continuous development of deep learning, ERC has attracted growing attention from academia and industry, and an overview of these research results is needed to support follow-up work. This paper comprehensively sorts out the research results in the field from the perspectives of problem definition, problem approach, research methods and mainstream datasets, and reviews and analyzes the development of ERC tasks. Compared with video and audio, dialogue text contains more information. This paper therefore focuses on text-based ERC methods, especially those based on deep learning. Finally, based on the current research status, it summarizes the open problems in the field and its future development trends.
    Improved Traffic Sign Detection Algorithm for YOLOv5
    HU Zhaohua, WANG Ying
    Computer Engineering and Applications    2023, 59 (1): 82-91.   DOI: 10.3778/j.issn.1002-8331.2207-0307
    Abstract (214)      PDF(pc) (1654KB) (108)
    Traffic sign detection is an important link in automatic driving and assisted driving, and it bears directly on driving safety. To address the difficulties of small targets and complex backgrounds in traffic sign detection, an algorithm based on an improved YOLOv5 is proposed. Firstly, a regional context module is proposed. It uses dilated convolutions with several dilation rates to obtain different receptive fields and thereby captures the feature information of the target and its adjacent areas. Information from adjacent areas plays an important role in detecting small traffic signs, so the module effectively alleviates the small-target problem. Secondly, a feature enhancement module is introduced in the backbone to further improve its feature extraction ability, and an attention mechanism is combined with the original C3 module so that the network focuses on small-target information and is less distracted by complex backgrounds. Finally, in the multi-scale detection part, the shallow feature layer is fused with the deep detection layer so that both shallow position information and deep semantic information are taken into account, which improves target localization accuracy and bounding box regression and benefits small-target detection. Experimental results on the traffic sign detection dataset TT100K show that the improved algorithm achieves 87.2% small-target detection precision, 92.4% small-target recall and 91.8% mAP, which are 3.5, 4.1 and 2.6 percentage points higher, respectively, than the original YOLOv5 algorithm, with a detection speed of 83.3 frame/s. On the CCTSDB dataset, mAP reaches 98.0%, an increase of 2.0 percentage points, and the detection speed is 90.9 frame/s. The proposed algorithm therefore effectively improves traffic sign detection precision and recall while keeping a comparable detection speed.
    Image Super-Resolution with Light-Weighted Pyramid Pooling-Based Attention Network
    FANG Jinsheng, ZHU Gupei
    Computer Engineering and Applications    2022, 58 (20): 197-205.   DOI: 10.3778/j.issn.1002-8331.2203-0266
    Abstract (212)      PDF(pc) (1347KB) (107)
    In deep learning-based image super-resolution reconstruction, most current algorithms improve performance by expanding the network scale, which increases the demand for computing resources. To solve this problem, a light-weighted pyramid pooling-based attention network (LiPAN) is proposed, which is composed of an information distillation block, pyramid pooling and a backward attention fusion module. The attention mechanism ensures that the network extracts important features, the pyramid pooling structure captures more context information and yields more accurate reconstruction results, and the distillation structure effectively improves network performance while reducing network parameters. In quantitative evaluation at scale factors of 2, 3 and 4 on four public datasets (Set5, Set14, BSD100 and Urban100), the proposed LiPAN model achieves superior PSNR and SSIM values compared with state-of-the-art lightweight network models. The results show that LiPAN has better super-resolution reconstruction performance than current mainstream light-weighted networks with a comparable number of model parameters.
    Overview of Image Edge Detection
    XIAO Yang, ZHOU Jun
    Computer Engineering and Applications    2023, 59 (5): 40-54.   DOI: 10.3778/j.issn.1002-8331.2209-0122
    Abstract (210)      PDF(pc) (921KB) (128)
    The task of edge detection is to identify pixels with significant brightness changes as target edges. It is a low-level problem in computer vision with important applications in object recognition and detection, object proposal generation, and image segmentation. Edge detection methods now fall into several types: traditional gradient-based detection methods, deep learning-based edge detection algorithms, and detection methods combined with emerging technologies. A finer classification of these methods gives researchers a clearer understanding of the trends in edge detection. Firstly, the theoretical basis and implementation methods of traditional edge detection are introduced. Then the main edge detection methods of recent years are summarized and classified according to the methods used, and their core techniques, such as branching structure, feature fusion and loss function, are introduced. The algorithms are compared on the benchmark dataset BSDS500 using two evaluation indicators: single-image optimal threshold (ODS) and frames per second (FPS). Finally, the current state of edge detection research is examined and summarized, and possible future research directions are discussed.
    Review of Research on Application of Vision Transformer in Medical Image Analysis
    SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing
    Computer Engineering and Applications    2023, 59 (8): 41-55.   DOI: 10.3778/j.issn.1002-8331.2206-0022
    Abstract (208)      PDF(pc) (869KB) (138)
    The deep self-attention network (Transformer) has a natural ability to model global features and long-range correlations of input information, which is strongly complementary to the inductive bias of convolutional neural networks (CNN). Inspired by its great success in natural language processing, Transformer has been widely introduced into computer vision tasks, especially medical image analysis, and has achieved remarkable performance. This paper first introduces typical work on vision Transformer for natural images. It then organizes and summarizes related work by lesion or organ in the subfields of medical image segmentation, medical image classification and medical image registration, focusing on the implementation ideas of some representative work. Finally, current research is discussed and future directions are pointed out. The purpose of this paper is to provide a reference for further in-depth research in this field.
    Research Advances on Graph Neural Network Recommendation of Knowledge Graph Enhancement
    WU Guodong, WANG Xueni, LIU Yuliang
    Computer Engineering and Applications    2023, 59 (4): 18-29.   DOI: 10.3778/j.issn.1002-8331.2205-0268
    Abstract (208)      PDF(pc) (638KB) (155)
    Existing recommendation methods are mainly based on users' historical interaction behavior and do not fully use the feature information related to users and items, so their recommendation results are not ideal. Graph neural network (GNN) recommendation enhanced by knowledge graph (KG) starts from the interaction graph constructed from user-item interaction behavior, introduces a knowledge graph that shares the same graph structure, and processes both with graph neural network technology to realize personalized recommendation. This paper discusses the research progress of GNN recommendation enhanced by knowledge graph. Firstly, building on a discussion of GNN recommendation and knowledge graph recommendation, the relevant research results are analyzed in depth from the aspects of item knowledge graph and collaborative knowledge graph. Then, the main problems in current research are pointed out, including large-scale dynamic knowledge graph processing, mining of user preferences for item attributes, and knowledge graph embedding learning. Finally, the main future research directions are predicted: GNN recommendation enhanced by knowledge graph in dynamic sequences, GNN recommendation enhanced by knowledge graph with meta-learning, GNN recommendation enhanced by multi-modal knowledge graph, GNN cross-domain recommendation enhanced by knowledge graph, and so on.
    Improved YOLOv5’s Foreign Object Debris Detection Algorithm for Airport Runways
    LI Xiaojun, DENG Yueming, CHEN Zhenghao, HE Xin
    Computer Engineering and Applications    2023, 59 (2): 202-211.   DOI: 10.3778/j.issn.1002-8331.2207-0462
    Abstract (208)      PDF(pc) (5308KB) (116)
    Foreign object debris (FOD) on airport runways occupies a small proportion of the image and has indistinct features, which often leads to false detection and missed detection. To address this problem, an improved YOLOv5 FOD target detection algorithm is proposed. Firstly, the multi-scale fusion and detection part is improved: high-resolution feature maps are fused to enhance the feature expression of small targets, and the large-target detection layer is removed to reduce the computational complexity of network inference. Secondly, a lightweight and efficient convolutional attention module (CBAM) is introduced, which improves the ability of the model to focus on target features along both the spatial and channel dimensions. Then the RepVGG module is used in the feature fusion stage to improve the feature fusion ability and detection accuracy of the model. Finally, SIoU Loss is used as the loss function to improve the speed and precision of bounding box regression. Comparative experiments are carried out on a self-made FOD dataset. The results show that the method achieves 95.01% mAP50 and 55.79% mAP50:95 while maintaining real-time performance, which are 2.78 and 3.28 percentage points higher, respectively, than the original YOLOv5 algorithm. It effectively reduces the false detection and missed detection of traditional FOD detection. Compared with mainstream target detection algorithms, the proposed improved algorithm is also more suitable for FOD detection tasks.
    Construction and Application of Discipline Knowledge Graph in Personalized Learning
    ZHAO Yubo, ZHANG Liping, YAN Sheng, HOU Min, GAO Mao
    Computer Engineering and Applications    2023, 59 (10): 1-21.   DOI: 10.3778/j.issn.1002-8331.2209-0345
    Abstract (206)      PDF(pc) (929KB) (187)
    The discipline knowledge graph is an important tool for supporting teaching activities based on big data, artificial intelligence and other technologies. As a semantic network of discipline knowledge, it contributes to the development of personalized learning systems and the promotion of new infrastructure for digital education resources. Firstly, this paper outlines the concept and classification of knowledge graphs. Secondly, it summarizes the concept, characteristics, advantages and connotation of the discipline knowledge graph and its support for personalized learning. Next, it focuses on the construction process of the discipline knowledge graph: discipline ontology construction, discipline knowledge extraction, discipline knowledge fusion and discipline knowledge processing. It also introduces the application of the discipline knowledge graph in personalized learning scenarios and the associated challenges. Finally, this paper discusses future trends of the discipline knowledge graph and personalized learning. It provides reference and inspiration for the organization of educational resources and the innovative development of personalized learning.
    Review of Research on Adapter and Prompt Tuning
    LIN Lingde, LIU Na, WANG Zheng'an
    Computer Engineering and Applications    2023, 59 (2): 12-21.   DOI: 10.3778/j.issn.1002-8331.2209-0025
    Abstract (205)      PDF(pc) (3579KB) (124)
    Text mining is a branch of data mining that covers a variety of technologies. Natural language processing is one of its core tools and aims to help users obtain useful information from massive data. In recent years, pre-trained models have played an important role in promoting the research and development of natural language processing, and fine-tuning methods for pre-trained models have become an important research field. Based on the relevant literature on pre-trained model fine-tuning published in recent years, this paper reviews the current mainstream Adapter and Prompt methods. First, the development of natural language processing is briefly reviewed, and the problems and difficulties in fine-tuning pre-trained models are analyzed. Secondly, the two kinds of fine-tuning methods, Adapter and Prompt, and the classic methods in these two research directions are introduced, and their advantages, disadvantages and performance are analyzed and summarized. Finally, this paper summarizes the limitations of current fine-tuning methods for pre-trained models and discusses future development directions.
    Review of Research on Driver Fatigue Driving Detection Methods
    ZHANG Rui, ZHU Tianjun, ZOU Zhiliang, SONG Rui
    Computer Engineering and Applications    2022, 58 (21): 53-66.   DOI: 10.3778/j.issn.1002-8331.2204-0053
    Abstract (205)      PDF(pc) (946KB) (139)
    The proportion of traffic accidents caused by fatigue driving has increased year by year, which has attracted widespread attention from researchers. At present, research on fatigue driving detection is limited by factors such as the level of science and technology, the environment and road conditions, which makes it difficult to develop the technology further. This article introduces the progress in driver fatigue driving detection methods over the past decade. The two categories of active detection methods and passive detection methods are elaborated and reviewed, and each category is further classified according to its characteristics. The advantages and limitations of the various fatigue driving detection methods are analyzed, and the detection algorithms used in active detection methods based on facial features over the past three years are analyzed and summarized. Finally, the shortcomings of the various methods are summarized and future research trends in the field of fatigue detection are proposed, which provides new ideas for further research.
    Review on Application of Deep Learning in Helmet Wearing Detection
    GAO Teng, ZHANG Xianwu, LI Bai
    Computer Engineering and Applications    2023, 59 (6): 13-29.   DOI: 10.3778/j.issn.1002-8331.2207-0434
    Abstract (199)      PDF(pc) (832KB) (151)
    Driven by deep learning, many approaches to object detection have made great progress in the field of industrial security, and helmet-wearing detection has gradually become a significant topic in intelligent image recognition. To comprehensively analyze the research status of deep learning in the helmet-wearing detection task and to facilitate follow-up research, this paper analyzes the state-of-the-art deep learning-based helmet-wearing detection algorithms proposed by domestic and foreign scholars in recent years and compares their advantages and limitations. The paper is structured in three sections: the establishment and usage of databases, the predominant algorithms for helmet-wearing detection, and the current challenges in the field. Future research directions and research focuses for helmet-wearing detection are also put forward.
    Wearing Mask Pedestrian Tracking Based on Improved YOLOv7 and DeepSORT
    ZHAO Yuanlong, SHAN Yugang, YUAN Jie
    Computer Engineering and Applications    2023, 59 (6): 221-230.   DOI: 10.3778/j.issn.1002-8331.2210-0479
    Abstract (198)      PDF(pc) (1007KB) (90)
    A pedestrian tracking algorithm based on improved YOLOv7 and DeepSORT is proposed to solve the problem that, in video sequences, face occlusion and missed detection prevent a correct judgment of whether pedestrians are wearing masks. The algorithm combines mask detection, pedestrian detection and tracking. Firstly, an attention mechanism is added to the backbone network of YOLOv7 and shallow feature maps are added to enhance the network's ability to perceive small targets, which improves the accuracy of mask detection and pedestrian detection. Secondly, an intra-frame relationship module uses the Hungarian algorithm to associate the targets within a frame and mark whether each pedestrian is wearing a mask. Then, a direction difference factor is added to the association cost of the DeepSORT algorithm to eliminate the inconsistency between the historical detection direction of a tracked trajectory and the velocity direction of the new detection. Finally, the improved DeepSORT algorithm tracks pedestrians and updates the mask-wearing mark of each track, so that pedestrians with and without masks are both tracked. The experimental results show that the average detection accuracy mAP50 of the improved YOLOv7 network is 3.83 percentage points higher than that of the original algorithm. On the MOT16 dataset, the tracking accuracy MOTA of the algorithm is 17.1 percentage points higher than that of the DeepSORT algorithm, and the tracking precision MOTP is 2.6 percentage points higher. Compared with using the detection algorithm alone, the proposed algorithm can determine the mask-wearing status of more pedestrians and achieves better results.
    Hybrid CTC/Attention End-to-End Chinese Speech Recognition Enhanced by Conformer
    CHEN Ge, XIE Xukang, SUN Jun, CHEN Qidong
    Computer Engineering and Applications    2023, 59 (4): 97-103.   DOI: 10.3778/j.issn.1002-8331.2111-0462
    Abstract (194)      PDF(pc) (568KB) (68)
    Recently, the Transformer structure based on self-attention has shown very good performance on a series of tasks in different fields. Firstly, the speech recognition model Transformer-LAS, which is based on a Transformer encoder and a LAS (listen, attend and spell) decoder, is explored. Because Transformer is not good at capturing local information, a Conformer-LAS model that uses Conformer instead of Transformer is proposed for automatic speech recognition. Secondly, because the alignment of attention is excessively flexible, its performance drops sharply in noisy environments. Connectionist temporal classification (CTC) is therefore used to assist training and speed up convergence, an intermediate CTC loss at the phoneme level is added for joint optimization, and an improved Conformer-LAS-CTC speech recognition model is proposed. Finally, the proposed model is verified on the open-source Mandarin Chinese dataset Aishell-1. The experimental results show that, compared with the baseline BLSTM-LAS and Transformer-LAS models, the character error rate of Conformer-LAS-CTC on the test set is reduced by 22.58% and 48.76% respectively, and the final character error rate of the model is 4.54%.
    Review of Real-Time Semantic Segmentation Algorithms for Deep Learning
    HE Jiafeng, CHEN Hongwei, LUO Dehan
    Computer Engineering and Applications    2023, 59 (8): 13-27.   DOI: 10.3778/j.issn.1002-8331.2210-0144
    Abstract (193)      PDF(pc) (1161KB) (160)
    Semantic segmentation is a technique that separates the different objects in a picture at the pixel level and labels each pixel of the original picture. Application fields such as UAV navigation, remote sensing images and medical diagnosis require semantic segmentation in real time, so real-time semantic segmentation based on deep learning has developed rapidly, and many technologies and models now exist. Based on a study of the related literature, this paper introduces real-time semantic segmentation by way of semantic segmentation and briefly describes its advantages. The key points and difficulties of real-time semantic segmentation are then discussed. The existing technologies and models are described in relation to these key points and difficulties, and their advantages and disadvantages are summarized. Finally, the challenges faced by real-time semantic segmentation are discussed and the field is summarized, which provides some theoretical reference for follow-up research.
    Survey of Short Text Classification Methods Based on Deep Learning
    GAN Yating, AN Jianye, XU Xue
    Computer Engineering and Applications    2023, 59 (4): 43-53.   DOI: 10.3778/j.issn.1002-8331.2209-0048
    Abstract (192)      PDF(pc) (609KB) (96)
    The research status of deep learning in short text classification is comprehensively analyzed from five aspects: CNN, RNN, CNN-RNN, GCN and other deep learning methods. Their advantages and disadvantages are compared, and the commonly used labeled datasets are summarized. The results show that current research on applying deep learning to short text classification mainly focuses on improving algorithm efficiency and expanding text information. Research on constructing labeled datasets for model testing is still at an initial stage. Most such datasets target specific fields such as movie reviews, commodity reviews and news, and they need continuous improvement. Future research will focus on algorithm improvement, information expansion and their integration, in order to find specific applications that achieve good classification results in practice.
    Overview of Table Detection and Structure Recognition
    ZHANG Yutong, LI Qiyuan, LIU Shukan
    Computer Engineering and Applications    2022, 58 (22): 1-11.   DOI: 10.3778/j.issn.1002-8331.2206-0337
    Abstract (192)      PDF(pc) (859KB) (155)
    In view of the current development of table analysis within document analysis, the recent literature in this field is sorted out, and the two key tasks, table detection and table structure recognition, are studied. For table detection, methods are divided into those based on object detection, graph neural networks, generative adversarial networks and deformable convolutional networks. For table structure recognition, methods include those based on object detection, graph neural networks, recurrent neural networks, and deformable and dilated convolutional networks. The methods and limitations of the various models are summarized, and the related tasks and their corresponding datasets are sorted out. The common open-source datasets in table analysis are summarized more broadly, and the source, scale, scope of application and file type of each dataset are introduced in detail. The commonly used evaluation metrics in table analysis are listed, and the experimental results of existing models are compared on different experimental datasets. The current development of table analysis is summarized, and future trends are discussed.
    Review of Musical Instrument Recognition in Music Information Retrieval
    PEI Wenbin, WANG Hailong, LIU Lin, PEI Dongmei
    Computer Engineering and Applications    2023, 59 (2): 34-47.   DOI: 10.3778/j.issn.1002-8331.2205-0492
    Abstract (191)      PDF(pc) (4853KB) (50)
    Efficient and accurate instrument recognition technology can effectively promote the in-depth development of research on sound source separation, music spectrum recognition, music genre classification and related topics, and it can be widely used in fields such as playlist generation, acoustic environment classification, intelligent instrument teaching and interactive multimedia. In recent years, with the continuous advancement of musical instrument recognition research, the performance of musical instrument recognition systems has improved greatly, but many problems remain, such as the difficulty of recognizing some instruments, the difficulty of extracting instrument audio features, and the low recognition accuracy for complex instruments. How to identify musical instruments efficiently and accurately with the help of artificial intelligence has become the focus and difficulty of current research. Based on the current research status, this paper summarizes the commonly used audio features, the musical instrument recognition models and methods, and the commonly used datasets for musical instrument recognition. It also summarizes the limitations and future development trends of current research, so as to provide a reference for research on musical instrument recognition.
    Survey of Research on Deep Multimodal Representation Learning
    PAN Mengzhu, LI Qianmu, QIU Tian
    Computer Engineering and Applications    2023, 59 (2): 48-64.   DOI: 10.3778/j.issn.1002-8331.2206-0145
    Abstract (191)      PDF(pc) (6521KB) (159)
    Although deep learning has been widely used in many fields because of its powerful nonlinear representation capabilities, the structural and semantic gap between multi-source heterogeneous modal data seriously hinders the application of subsequent deep learning models. Scholars have proposed a large number of representation learning methods to explore the correlation and complementarity between different modalities and to improve the prediction and generalization performance of deep learning. However, research on multimodal representation learning is still in its infancy, and many scientific problems remain to be solved. So far, multimodal representation learning lacks a unified understanding, and its architecture and evaluation metrics are not fully clear. According to the feature structure, semantic information and representation ability of different modalities, this paper studies and analyzes the progress of deep multimodal representation learning from the perspectives of representation fusion and representation alignment, and it systematically summarizes and classifies the existing research work. This paper also analyzes the basic structure, application scenarios and key issues of representative frameworks and models, examines the theoretical basis and latest developments of deep multimodal representation learning, and points out the current challenges and future development of the field, in order to further promote the development and application of deep multimodal representation learning.
    Research Progress on Vision System and Manipulator of Fruit Picking Robot
    GOU Yuanmin, YAN Jianwei, ZHANG Fugui, SUN Chengyu, XU Yong
    Computer Engineering and Applications    2023, 59 (9): 13-26.   DOI: 10.3778/j.issn.1002-8331.2209-0183
    Abstract (190)      PDF(pc) (787KB) (125)
    The fruit-picking robot is of great significance for making fruit equipment automatic and intelligent. This paper summarizes the research of recent years, at home and abroad, on the key technologies of fruit-picking robots. Firstly, the key technologies of the fruit-picking robot vision system are discussed, including traditional image segmentation methods based on fruit features, such as the threshold method, the edge detection method, clustering algorithms based on color features and region-based image segmentation algorithms. Object recognition algorithms based on deep learning and methods for locating the target fruit are then analyzed and compared, and the state of the art of fruit-picking robot manipulators and end-effectors is summarized. Finally, the future development trends and directions of fruit-picking robots are discussed, which can provide a reference for related research.