Computer Engineering and Applications, 2015, Vol. 51, Issue 23: 230-235.

• Engineering and Applications •

Sentiment analysis in financial domain based on SVM with dependency syntax

HUANG Jin1, RUAN Tong1, JIANG Ruiquan2

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  2. Department of Technology, Shanghai Stock Exchange, Shanghai 200120, China
  • Online: 2015-12-01 Published: 2015-12-14

Abstract: The link between users' sentiment and market fluctuations plays an important role in monitoring financial markets and in handling abnormal stock-price movements, so sentiment analysis of user-generated text in the financial domain is valuable. However, because financial text contains many domain terms and long sentences, and because ready-made sentiment corpora are lacking, sentiment analysis research in this domain remains relatively scarce. Based on the characteristics of financial text, this paper proposes a method that combines SVM classification with Stanford dependency parsing to compute a document's sentiment value. The method takes into account the features of financial sentiment words, the position weights of words within a single sentence, and the modification relations among sentiment words. In experiments on financial-domain data extracted from major financial websites, the F-measure reaches 82.1%, which is more accurate than the other methods in the literature.

Key words: financial domain, sentiment analysis, positional relationship, Support Vector Machine (SVM), dependency parsing
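
The abstract names three ingredients of the document sentiment value: sentiment-word features, position weights within a sentence, and modification relations between words. The following is a minimal sketch of how such a score could be assembled, not the authors' implementation. The lexicon, the modifier strengths, the linear position weight, and the function names are all hypothetical, the dependency edges are written by hand rather than produced by the Stanford parser, and the SVM classification step is omitted entirely.

```python
# Hypothetical toy lexicon: polarity of sentiment words (illustrative values only).
LEXICON = {"surge": 1.0, "profit": 1.0, "loss": -1.0, "plunge": -1.0}

# Hypothetical modifier strengths: a negator flips the sign, degree words scale it.
MODIFIERS = {"not": -1.0, "sharply": 1.5, "slightly": 0.5}


def position_weight(index, length):
    """Assumed weighting: words nearer the sentence end count more (1.0 to 2.0)."""
    if length <= 1:
        return 1.0
    return 1.0 + index / (length - 1)


def sentence_score(tokens, edges):
    """Score one sentence.

    tokens: list of words.
    edges: (head_index, dependent_index) pairs, in the shape a dependency
           parser might emit; here they are supplied by hand.
    """
    # For each head word, multiply together the modifiers attached to it.
    factors = {}
    for head, dep in edges:
        word = tokens[dep]
        if word in MODIFIERS:
            factors[head] = factors.get(head, 1.0) * MODIFIERS[word]

    score = 0.0
    for i, word in enumerate(tokens):
        if word in LEXICON:
            weight = position_weight(i, len(tokens))
            score += LEXICON[word] * factors.get(i, 1.0) * weight
    return score


def document_score(sentences):
    """Sum sentence scores; sentences is a list of (tokens, edges) pairs."""
    return sum(sentence_score(tokens, edges) for tokens, edges in sentences)
```

In this sketch a modifier only affects the sentiment word it depends on, so a negator attached to a neutral verb leaves nearby sentiment words unchanged. That is one plausible reading of "modification relations", chosen here for simplicity.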