Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (23): 333-339. DOI: 10.3778/j.issn.1002-8331.2308-0344

• Engineering and Applications •

Clay Mineral Image Classification Using Fusion of Improved Residual Network and Attention Mechanism

DU Ruishan, CHEN Yuxin, MENG Lingdong, ZHANG Tong, CHENG Jiaxin   

  1. School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, China
  2. Key Laboratory of Oil and Gas Reservoir and Underground Gas Storage Integrity Evaluations (Northeast Petroleum University), Daqing, Heilongjiang 163318, China
  3. Exploration and Development Research Institute, PetroChina Daqing Oilfield Limited Company, Daqing, Heilongjiang 163712, China
  4. School of Earth Sciences, Northeast Petroleum University, Daqing, Heilongjiang 163318, China
  • Online: 2024-12-01 Published: 2024-11-29

融合改进残差网络和注意力的黏土矿物图像分类

杜睿山,陈雨欣,孟令东,张桐,程佳薪   

  1. 东北石油大学 计算机与信息技术学院,黑龙江 大庆 163318
  2. 油气藏及地下储库完整性评价黑龙江省重点实验室(东北石油大学),黑龙江 大庆 163318
  3. 大庆油田有限责任公司勘探开发研究院,黑龙江 大庆 163712
  4. 东北石油大学 地球科学学院,黑龙江 大庆 163318

Abstract: Clay minerals are hydrous aluminosilicate minerals that occur widely in sedimentary basins. Identifying them from scanning electron microscopy (SEM) images requires feature information that is both diverse and easy to discriminate. However, the feature extraction capability of convolutional neural networks (CNNs) remains insufficient for this task, and the expressive power of the extracted features needs to be strengthened. Low-level features usually carry the basic information and fine details of an image, whereas high-level feature maps are more abstract. A network is therefore proposed that uses Res2Net50 as the backbone, introduces an efficient channel attention (ECA) module so that the model attends more closely to low-level features, and adopts a multi-head self-attention (MHSA) module in the last layer to learn feature representations at different levels. This hybrid design combines the feature extraction capability of CNNs with the content-based self-attention mechanism of Transformers, which strengthens feature representation and yields better performance than Res2Net50. Experiments show that the proposed EM-Res2Net (ECA-MHSA-Res2Net50) performs well for mineral identification with 128×128 pixel inputs, reaching a test recognition accuracy of 92.85%. Comparison with other models indicates that introducing the ECA and MHSA modules allows the model to extract image features more fully, with higher accuracy and efficiency.

Key words: deep learning, mineral identification, image classification, multi-scale features, channel attention, self-attention
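
The two attention components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `eca` and `mhsa`, the fixed kernel weights, the random projection matrices, and the toy 8×4×4 feature map are all assumptions made for illustration, and details such as learned parameters, positional encodings, and how the modules are placed inside Res2Net50 are omitted because the abstract does not specify them.

```python
import numpy as np


def eca(x, kernel):
    """Efficient channel attention on a (C, H, W) feature map.

    kernel: 1-D weights of odd length k, shared by all channels
    (hypothetical values here; in practice they are learned).
    """
    k = kernel.shape[0]
    assert k % 2 == 1, "kernel length must be odd"
    # Squeeze: global average pooling gives one descriptor per channel.
    desc = x.mean(axis=(1, 2))
    # Local cross-channel interaction: zero-padded 1-D cross-correlation
    # along the channel axis (np.convolve flips its kernel, so flip it back).
    mixed = np.convolve(np.pad(desc, k // 2), kernel[::-1], mode="valid")
    weights = 1.0 / (1.0 + np.exp(-mixed))  # sigmoid gate in (0, 1)
    # Excite: rescale every channel by its gate.
    return x * weights[:, None, None]


def mhsa(x, wq, wk, wv, heads):
    """Multi-head self-attention over the H*W positions of a (C, H, W) map.

    wq, wk, wv: (C, C) projection matrices (assumed; normally learned).
    """
    c, h, w = x.shape
    assert c % heads == 0, "channels must divide evenly across heads"
    d = c // heads
    n = h * w
    tokens = x.reshape(c, n).T  # (N, C): one token per spatial position

    def split(t):
        # (N, C) -> (heads, N, d)
        return t.reshape(n, heads, d).transpose(1, 0, 2)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (heads, N, N)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, c)  # merge heads
    return out.T.reshape(c, h, w)


# Toy usage: gate a random feature map with ECA, then mix positions with MHSA.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
gated = eca(feat, kernel=np.array([0.25, 0.5, 0.25]))
wq, wk, wv = (rng.standard_normal((8, 8)) * 0.1 for _ in range(3))
mixed = mhsa(gated, wq, wk, wv, heads=2)
print(gated.shape, mixed.shape)
```

Both functions preserve the shape of the input feature map, so in a sketch like this either one could follow a convolutional stage without changing the surrounding layers. ECA only rescales channels, while MHSA lets every spatial position aggregate information from all others.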

摘要: 黏土矿物是广泛存在于沉积盆地中的含水铝硅酸盐矿物,通过扫描电子显微镜(scanning electron microscopy,SEM)获取黏土矿物在此基础上进行识别,需要获取黏土矿物的多样性和易于分辨的特征信息,然而卷积神经网络其特征提取能力仍存在不足,特征表达能力亟须增强。底层特征通常包含了图像的基本信息和细节,高层次的特征图抽象程度较高,包含了更为抽象化的信息。因此提出了一种以Res2Net50为主干网络,引入高效通道注意力(efficient channel attention,ECA)模块使模型更关注底层特征,从而提升模型性能,最后一层采用多头注意力(multi-head self-attention,MHSA)模块学习到不同层次的特征表示。采用混合方式同时利用了CNN的特征提取能力、Transformer的内容自注意力机制提升模型的特征表达能力,取得了优于Res2Net50的性能。结果表明:提出的EM-Res2Net(ECA-MHSA-Res2Net50)对128×128像素输入的矿物识别效果更为理想,经测试识别准确率为92.85%,通过与其他模型对比分析,该研究结果证明引入ECA模块和MHSA模块后的模型能够充分提取图像的特征,具有更高的准确率和时效性。

关键词: 深度学习, 矿物识别, 图像分类, 多尺度特征, 通道注意力, 自注意力