Computer Engineering and Applications, 2022, Vol. 58, Issue (8): 125-135. DOI: 10.3778/j.issn.1002-8331.2203-0479

• Pattern Recognition and Artificial Intelligence •

Expression Recognition and Interaction of Pharyngeal Swab Collection Robot

GUO Xinwei, MA Nan, LIU Weifeng, SUN Fuchun, ZHANG Jinli, CHEN Yang, ZHANG Guoping

  1. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
  2. Department of Informatics, Beijing University of Technology, Beijing 100124, China
  3. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
  4. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Online: 2022-04-15 Published: 2022-04-18

Abstract: For a pharyngeal swab collection robot, perceptual interaction with the person being sampled, including recognizing their facial expressions and emotional state, is important to completing the collection task successfully. Training on the data captured by the robot raises several key problems: occlusion removal and its extension to micro-expression recognition, the definition of a self-collected dataset, key-frame recognition and processing, expression classification and delimitation, and interaction integration. To address these problems, this paper designs an efficient self-cured network (ESCN). First, a weighted feature extraction model with the ability to recognize faces is constructed. Second, the subject's facial expression region is captured by a multi-scale attention mechanism. Finally, the facial expressions of the people being sampled are identified by a linear aggregation model with relabel correction. The method is validated on the real-world datasets RAF-DB and FER2013 and on a self-collected dataset. The experimental results show that ESCN improves accuracy by 4.643 to 11.058 percentage points over classic models, while its parameter count remains relatively small, which facilitates lightweight deployment and integration.

Key words: pharyngeal swab collection, facial expression recognition, feature extraction, perceptual interaction
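
The abstract names relabel correction as one component of ESCN but does not give the rule. As an illustration only, the sketch below shows one common margin-based relabeling rule used in self-cured-style training: a sample's label is switched to the predicted class when the predicted probability exceeds the probability of the given label by more than a margin. The function names `softmax` and `relabel`, the `margin` value of 0.2, and the three-class logits are assumptions for this example, not details taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relabel(logits_batch, labels, margin=0.2):
    """Return a corrected copy of `labels`.

    Hypothetical rule (assumption, not the paper's stated method):
    replace a label with the predicted class when
    p(predicted) - p(given label) > margin.
    """
    corrected = []
    for logits, label in zip(logits_batch, labels):
        probs = softmax(logits)
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if probs[predicted] - probs[label] > margin:
            corrected.append(predicted)  # confident disagreement: relabel
        else:
            corrected.append(label)      # otherwise keep the given label
    return corrected


# Illustrative logits for three samples over three expression classes.
example_logits = [
    [4.0, 0.0, 0.0],  # strongly predicts class 0, but labeled 1
    [1.0, 0.9, 0.0],  # weakly prefers class 0, labeled 1
    [0.0, 0.0, 3.0],  # predicts class 2, labeled 2
]
example_labels = [1, 1, 2]
corrected_labels = relabel(example_logits, example_labels)
```

Only the first sample is relabeled here, because the model's disagreement with its given label is the only one that clears the margin; a larger margin makes the correction more conservative.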