Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (2): 306-313. DOI: 10.3778/j.issn.1002-8331.2108-0080

• Engineering and Applications •

Driver Road Rage Recognition Based on Improved MFCC Fusion Feature and FA-PNN

LI Shangqing, WANG Xiaoyuan, ZHANG Yang, LI Hao, XIANG Hui   

  1. School of Mechanical and Electrical Engineering, Qingdao University of Science and Technology, Qingdao, Shandong 266000, China
  2. Collaborative Innovation Center of Intelligent Green Manufacturing Technology and Equipment, Qingdao University of Science and Technology, Qingdao, Shandong 266000, China
  • Online: 2023-01-15 Published: 2023-01-15

Abstract: Few existing methods for recognizing driver road rage analyze voice characteristics. This study takes road rage as its research object and builds a dataset with a driving simulation system. By analyzing the spectral characteristics of the driver's voice, it fuses the time-domain short-time energy and short-time zero-crossing rate with improved Mel-frequency cepstral coefficient (MFCC) features into a single feature vector. A probabilistic neural network (PNN) optimized by the firefly algorithm (FA) then serves as the recognition model for driver road rage. The results show that, under the same neural network, the improved MFCC fusion features are more robust to noise than traditional MFCC features. The FA-PNN model reaches a recognition accuracy of 93.0%, which is 11 percentage points higher than that of the traditional PNN model, and an F1-score of 0.9328, an improvement of 0.1047. The study demonstrates that voice signal processing is feasible for recognizing driver road rage and offers a new method for research on active vehicle safety warning.
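
The two time-domain features named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, helper names (`frame_signal`, `fuse_features`), and the placeholder cepstral matrix are all assumptions made for illustration, and the paper's improved MFCC step is not reproduced here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding; assumed framing)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of sign changes per frame, normalized by 2 * frame length."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.sum(np.abs(np.diff(signs, axis=1)), axis=1) / (2 * frames.shape[1])

def fuse_features(energy, zcr, cepstral):
    """Concatenate per-frame energy, ZCR, and cepstral coefficients column-wise."""
    return np.column_stack([energy, zcr, cepstral])

# Toy signal that alternates sign on every sample.
x = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
frames = frame_signal(x, frame_len=4, hop=2)
energy = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
# Placeholder for 12 cepstral coefficients per frame (hypothetical values).
cepstral = np.zeros((frames.shape[0], 12))
features = fuse_features(energy, zcr, cepstral)
```

Each row of `features` is one frame's fused vector; in practice the cepstral columns would come from an MFCC extractor rather than zeros.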

Key words: road rage, voice signal processing, firefly algorithm-probabilistic neural network (FA-PNN), improved Mel-frequency cepstral coefficients, feature fusion
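
As a rough illustration of the FA-PNN idea, the sketch below pairs a Gaussian-kernel PNN with a one-dimensional firefly search. The abstract does not say which PNN parameter the firefly algorithm tunes; this sketch assumes it is the smoothing factor `sigma`. The function names, the hyperparameters (`beta0`, `gamma`, `alpha`), and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma):
    """Classify by the mean Gaussian kernel response of each class's training points."""
    classes = np.unique(y_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack([k[:, y_train == c].mean(axis=1) for c in classes], axis=1)
    return classes[np.argmax(scores, axis=1)]

def firefly_search(fitness, low, high, n_fireflies=8, n_iter=20,
                   beta0=1.0, gamma=1.0, alpha=0.1, seed=0):
    """Maximize a 1-D fitness: dimmer fireflies move toward brighter ones."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(low, high, size=n_fireflies)
    light = np.array([fitness(p) for p in pos])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:
                    r = abs(pos[i] - pos[j])
                    beta = beta0 * np.exp(-gamma * r ** 2)  # attractiveness decays with distance
                    step = alpha * (high - low) * (rng.random() - 0.5)  # random walk term
                    pos[i] = np.clip(pos[i] + beta * (pos[j] - pos[i]) + step, low, high)
                    light[i] = fitness(pos[i])
    best = int(np.argmax(light))
    return pos[best], light[best]

# Toy two-class data (hypothetical; stands in for fused voice features).
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 1, 1])
X_val = np.array([[0.2, 0.5], [5.1, 5.4]])
y_val = np.array([0, 1])

def accuracy(sigma):
    """Validation accuracy of the PNN for a given smoothing factor."""
    return float(np.mean(pnn_predict(X_train, y_train, X_val, sigma) == y_val))

best_sigma, best_acc = firefly_search(accuracy, low=0.1, high=2.0)
pred = pnn_predict(X_train, y_train, X_val, best_sigma)
```

On real data the fitness would be computed on a held-out validation split, so that the selected `sigma` is not fitted to the test set.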