Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (9): 103-106.

• Graphics and Image Processing •

Video scene classification based on temporal context

PENG Taile1,2, ZHANG Wenjun3, DING Youdong3, GUO Guifang2

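  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200072
  2. School of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui 235000
  3. School of Film and TV Arts and Technology, Shanghai University, Shanghai 200072
  • Publication date: 2014-05-01 Release date: 2014-05-14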

Video classification based on time series contextual information

PENG Taile1,2, ZHANG Wenjun3, DING Youdong3, GUO Guifang2   

  1. School of Communication & Information Engineering, Shanghai University, Shanghai 200072, China
  2. School of Computer Science and Technology, Huaibei Normal University, Huaibei, Anhui 235000, China
  3. School of Film and TV Arts & Technology, Shanghai University, Shanghai 200072, China
  • Online: 2014-05-01 Published: 2014-05-14

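Abstract (translated from the Chinese): Building on the traditional bag-of-words model, this paper proposes a video scene classification model that exploits the correlation between key frames of adjacent shots. The video clip is segmented into shots, key frames are extracted, and the key frame images are normalized. The key frames are then treated as image blocks and assembled in temporal order into a new image, from which SIFT features and HSV color features are extracted and mapped into a Hilbert space. Through multiple kernel learning, a suitable set of kernel functions is selected to train on each image, yielding the classification model. Experiments on a variety of videos show that the method performs well in video scene classification.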

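Keywords (translated from the Chinese): temporal context features, scale-invariant feature transform (SIFT) features, HSV color features, multiple kernel learning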

Abstract: Building on the traditional bag-of-words model and exploiting the spatial and semantic similarity between the key frames of adjacent shots, this paper proposes a new video scene classification model. The method divides a video clip into shots, extracts their key frames, and normalizes the key frame images. The key frames are then treated as image blocks and combined in temporal order into a new image, from which SIFT features and HSV color features are extracted. These features are embedded into a Hilbert space. Through multiple kernel learning, the algorithm selects appropriate kernel functions to train on each image and obtains the classification model. Experiments show that the proposed algorithm performs well in video scene classification.

Key words: time series contextual features, Scale-Invariant Feature Transform (SIFT) features, HSV color features, multiple kernel learning
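
The pipeline summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it only shows, under stated assumptions, three of the steps the abstract names: assembling normalized key frames in temporal order into one composite image, computing an HSV color histogram of that image, and combining several base kernels into one. The function names, the histogram bin counts, the choice of RBF and histogram-intersection kernels, and the fixed kernel weights are all assumptions made for illustration. The abstract does not specify them, and in multiple kernel learning the weights would be learned rather than fixed. SIFT extraction is omitted because it needs an external library.

```python
import colorsys
import math


def tile_key_frames(frames):
    """Concatenate key frames left to right in temporal order.

    Each frame is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    Frames are assumed to be already normalized to the same height.
    """
    height = len(frames[0])
    if any(len(frame) != height for frame in frames):
        raise ValueError("key frames must share the same height")
    return [[px for frame in frames for px in frame[y]] for y in range(height)]


def hsv_histogram(image, bins=(8, 4, 4)):
    """Joint HSV histogram, L1-normalized. Bin counts are an assumption."""
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    count = 0
    for row in image:
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hi = min(int(h * hb), hb - 1)
            si = min(int(s * sb), sb - 1)
            vi = min(int(v * vb), vb - 1)
            hist[(hi * sb + si) * vb + vi] += 1.0
            count += 1
    return [x / count for x in hist]


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel on two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def intersection_kernel(x, y):
    """Histogram intersection kernel for non-negative feature vectors."""
    return sum(min(a, b) for a, b in zip(x, y))


def combined_kernel(x, y, weights=(0.5, 0.5)):
    """Convex combination of two base kernels.

    The weights are fixed here for illustration; a multiple kernel learning
    solver would estimate them from training data.
    """
    return weights[0] * rbf_kernel(x, y) + weights[1] * intersection_kernel(x, y)


# Two tiny 2x2 "key frames" standing in for consecutive shots.
red = [[(255, 0, 0)] * 2 for _ in range(2)]
blue = [[(0, 0, 255)] * 2 for _ in range(2)]

composite = tile_key_frames([red, blue])  # 2 rows, 4 columns
hist = hsv_histogram(composite)
similarity = combined_kernel(hist, hist)
```

In a fuller version, the combined kernel values for all pairs of training images would form a Gram matrix passed to a kernel classifier such as an SVM, with one base kernel per feature type (for example one for SIFT-based bag-of-words histograms and one for HSV histograms).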