Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (15): 110-116. DOI: 10.3778/j.issn.1002-8331.2010-0081

• Big Data and Cloud Computing •

Deep Convolutional Neural Network Algorithm Based on Feature Map in Big Data Environment

MAO Yimin, ZHANG Ruipeng, GAO Bo   

  1. School of Information Engineering, Jiangxi University of Science & Technology, Ganzhou, Jiangxi 341000, China
  2. Xi’an Geological Survey Center of China Geological Survey, Xi’an 710000, China
  • Online: 2022-08-01 Published: 2022-08-01

大数据下基于特征图的深度卷积神经网络

毛伊敏,张瑞朋,高波   

  1. 江西理工大学 信息工程学院,江西 赣州 341000
  2. 中国地质调查局 西安地质调查中心,西安 710000

Abstract: In big data environments, the DCNN (deep convolutional neural network) algorithm suffers from excessive redundant network parameters, poor parameter optimization ability, and low parallel efficiency. To address these problems, this paper proposes MR-FPDCNN (deep convolutional neural network algorithm based on feature graph and parallel computing entropy using MapReduce), a DCNN algorithm based on feature maps and parallel computing entropy. First, the algorithm designs the FMPTL (feature map pruning based on Taylor loss) strategy and pre-trains the network to obtain a compressed DCNN, which effectively reduces redundant parameters and lowers the computational cost of DCNN training. Second, it proposes IFAS (improved firefly algorithm based on ISS), a firefly optimization algorithm built on the ISS (information sharing strategy) search strategy, and initializes the DCNN parameters with IFAS to train the DCNN in parallel and improve the network's optimization ability. Finally, in the Reduce phase, it proposes DLBPCE (dynamic load balancing strategy based on parallel computing entropy) to obtain the global training results, which groups data quickly and evenly and increases the speedup ratio of the parallel system. Experimental results show that the algorithm not only reduces the computational cost of DCNN training in big data environments but also improves the parallelization performance of the parallel system.
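
The abstract does not state the FMPTL formula. As a rough illustration of Taylor-based feature map pruning in general, the sketch below ranks feature maps by a first-order Taylor importance score, |mean(activation × gradient)| averaged over a batch, and marks the lowest-scoring maps for removal. The data layout, the function names `taylor_scores` and `channels_to_prune`, and the `prune_ratio` parameter are assumptions made for this example and are not taken from the paper.

```python
from typing import List

# Hypothetical layout for this sketch: tensor[n][c] is the flattened (H*W)
# feature map of channel c for sample n.
Tensor = List[List[List[float]]]


def taylor_scores(activations: Tensor, gradients: Tensor) -> List[float]:
    """Per-channel first-order Taylor importance.

    For each sample, the score of a channel is |mean(a * dL/da)| over its
    spatial positions; scores are then averaged over the batch.
    """
    n_samples = len(activations)
    n_channels = len(activations[0])
    scores = [0.0] * n_channels
    for acts, grads in zip(activations, gradients):
        for c in range(n_channels):
            a, g = acts[c], grads[c]
            mean_prod = sum(x * y for x, y in zip(a, g)) / len(a)
            scores[c] += abs(mean_prod) / n_samples
    return scores


def channels_to_prune(scores: List[float], prune_ratio: float) -> List[int]:
    """Indices of the lowest-scoring feature maps.

    prune_ratio is an assumed knob: the fraction of channels to drop.
    """
    k = int(len(scores) * prune_ratio)
    ranked = sorted(range(len(scores)), key=lambda c: scores[c])
    return sorted(ranked[:k])
```

A feature map whose activations barely change the loss gets a score near zero and is pruned first; the compressed network would then be fine-tuned, which this sketch leaves out.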

Key words: deep convolutional neural network (DCNN) algorithm, MapReduce framework, feature map pruning based on Taylor loss (FMPTL) strategy, IFAS algorithm, dynamic load balancing strategy based on parallel computing entropy (DLBPCE)
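
IFAS is described only as a firefly algorithm improved with an information sharing strategy, so its update rule cannot be reproduced from the abstract. For orientation, the sketch below shows the standard firefly algorithm that such variants start from, used to pick an initial parameter vector that minimizes a loss. The function names, the default values of `beta0`, `gamma`, and `alpha`, and the toy setup are assumptions for illustration; the ISS modification is not included.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def firefly_sweep(swarm: List[Vector], loss: Callable[[Vector], float],
                  rng: random.Random, beta0: float = 1.0,
                  gamma: float = 1.0, alpha: float = 0.05) -> None:
    """One sweep of the standard firefly algorithm (minimization), in place."""
    n = len(swarm)
    for i in range(n):
        for j in range(n):
            # Lower loss means brighter; firefly i moves only toward
            # fireflies that are currently brighter than it.
            if loss(swarm[j]) < loss(swarm[i]):
                r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                beta = beta0 * math.exp(-gamma * r2)  # attraction decays with distance
                swarm[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                            for a, b in zip(swarm[i], swarm[j])]


def init_parameters(loss: Callable[[Vector], float], dim: int,
                    n_fireflies: int = 10, n_sweeps: int = 20,
                    seed: int = 0) -> Vector:
    """Return the best position found, to be used as initial parameters."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(n_fireflies)]
    for _ in range(n_sweeps):
        firefly_sweep(swarm, loss, rng)
    return min(swarm, key=loss)
```

Because the currently brightest firefly never moves, the best loss in the swarm cannot get worse from one sweep to the next. In a DCNN setting, `loss` would evaluate the network on a data split with the candidate parameters, which is far more expensive than the toy losses this sketch is suited to.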

摘要: 针对大数据环境下DCNN(deep convolutional neural network)算法中存在网络冗余参数过多、参数寻优能力不佳和并行效率低的问题,提出了大数据环境下基于特征图和并行计算熵的深度卷积神经网络算法MR-FPDCNN(deep convolutional neural network algorithm based on feature graph and parallel computing entropy using MapReduce)。该算法设计了基于泰勒损失的特征图剪枝策略FMPTL(feature map pruning based on Taylor loss),预训练网络,获得压缩后的DCNN,有效减少了冗余参数,降低了DCNN训练的计算代价。提出了基于信息共享搜索策略ISS(information sharing strategy)的萤火虫优化算法IFAS(improved firefly algorithm based on ISS),根据“IFAS”算法初始化DCNN参数,实现DCNN的并行化训练,提高网络的寻优能力。在Reduce阶段提出了基于并行计算熵的动态负载均衡策略DLBPCE(dynamic load balancing strategy based on parallel computing entropy),获取全局训练结果,实现了数据的快速均匀分组,从而提高了集群的并行效率。实验结果表明,该算法不仅降低了DCNN在大数据环境下训练的计算代价,而且提高了并行系统的并行化性能。

关键词: DCNN算法, MapReduce框架, FMPTL策略, IFAS算法, DLBPCE策略