Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (21): 95-101.DOI: 10.3778/j.issn.1002-8331.2010-0085


Parallel Selective Kernel Attention Based on HardSoftmax

ZHU Meng, MIN Weidong, ZHANG Yu, DUAN Jingwen

  1.School of Information Engineering, Nanchang University, Nanchang 330031, China
    2.School of Software, Nanchang University, Nanchang 330047, China
    3.Jiangxi Key Laboratory of Smart City, Nanchang 330047, China
  • Online:2021-11-01 Published:2021-11-04


Attention mechanisms have been widely used in Convolutional Neural Networks (CNNs) and effectively improve their performance, while remaining very lightweight and requiring almost no change to the original CNN architecture. This paper proposes Parallel Selective Kernel (PSK) attention based on HardSoftmax. Firstly, to address the problem that the exponential operation in Softmax is prone to computational overflow for large positive inputs, this paper proposes the computationally safer HardSoftmax as a replacement for Softmax. Then, unlike Selective Kernel (SK) attention, which performs the extraction and transformation of global features after feature fusion, PSK attention places this step in a separate branch, running in parallel with the multiple branches of different kernel sizes. Meanwhile, the transformation of global features uses group convolution to further reduce the number of parameters and Multiply-Adds (MAdds). Finally, the multiple branches with different kernel sizes are fused using HardSoftmax attention guided by the information in these branches. Extensive image classification experiments show that simply replacing Softmax with HardSoftmax maintains or improves the performance of the original attention, and HardSoftmax also runs faster than Softmax in the experiments of this paper. PSK attention matches or outperforms SK attention with fewer parameters and MAdds.
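The overflow problem motivating HardSoftmax can be illustrated with a minimal sketch. The exact formulation of HardSoftmax is not given in this abstract; the `hard_softmax` below is a hypothetical exponential-free normalization (shift to non-negative values, then normalize linearly) used only to show how removing the exponential avoids overflow for large positive inputs.

```python
import numpy as np

def softmax(x):
    # Naive softmax: np.exp overflows to inf for large positive inputs
    # (roughly x > 709 in float64), yielding inf/inf = NaN weights.
    e = np.exp(x)
    return e / e.sum()

def hard_softmax(x):
    # Hypothetical exponential-free variant (the paper's actual HardSoftmax
    # may differ): shift inputs to be non-negative, normalize by the sum.
    # No exponential, so arbitrarily large inputs cannot overflow.
    s = x - x.min()
    total = s.sum()
    if total == 0.0:
        return np.full_like(x, 1.0 / x.size)  # uniform weights for constant input
    return s / total

logits = np.array([1000.0, 0.0, 500.0])
print(softmax(logits))       # contains NaN due to exp overflow
print(hard_softmax(logits))  # valid weights summing to 1
```

Both functions return non-negative weights that sum to one on well-behaved inputs, but only the exponential-free version stays finite when the logits are very large.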

Key words: Convolutional Neural Networks(CNNs), HardSoftmax, Parallel Selective Kernel(PSK) attention
