Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (14): 192-194.

• Engineering and Applications •

Implementation of Supply and Demand Information Classification Based on Latent Semantic Indexing


  • Received: 2006-09-22 Online: 2007-05-10 Published: 2007-05-10


ZHU Xuehao (朱学昊), WANG Rujing (王儒敬)

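  1. Hefei Institute of Intelligent Machines, Chinese Academy of Sciences; Department of Automation, University of Science and Technology of China; Hefei Institute of Intelligent Machines, Chinese Academy of Sciences
  • Corresponding author: ZHU Xuehao (朱学昊)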

Abstract: This paper presents an implementation of information retrieval and automatic classification. To overcome the shortcomings of traditional classification methods, it introduces an improved classification scheme based on latent semantic indexing (LSI), a retrieval model built on singular value decomposition (SVD). The decomposition strengthens or weakens the weight of every term, which makes the latent semantics clearer, so most of the noisy data can be cut off at the very beginning and classification accuracy improves.
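
The strengthening and weakening of terms can be illustrated with a rank-k truncation of a term-document matrix. The following is a minimal sketch in Python with NumPy, not the authors' implementation: the matrix `A`, the term list, the helper name `lsi_rank_k`, and the choice `k=2` are all assumptions made up for illustration.

```python
import numpy as np

def lsi_rank_k(A, k):
    """Return the rank-k approximation of a term-document matrix A.

    A has one row per term and one column per document. Keeping only
    the k largest singular values is the truncation step used in LSI.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Hypothetical toy data: two crop documents and two machinery documents.
terms = ["wheat", "grain", "tractor", "plough"]
A = np.array([
    [1.0, 1.0, 0.0, 0.0],  # wheat
    [0.0, 1.0, 0.0, 0.0],  # grain
    [0.0, 0.0, 1.0, 1.0],  # tractor
    [0.0, 0.0, 0.0, 1.0],  # plough
])

A_k = lsi_rank_k(A, k=2)

# Compare each term's weight before and after truncation.
for term, before, after in zip(terms, A, A_k):
    print(term, before, np.round(after, 3))
```

In this toy matrix, "grain" never occurs in the first document, yet its weight there becomes positive after truncation because "grain" co-occurs with "wheat" elsewhere. Some original weights shrink, and weights linking crop terms to machinery documents stay at zero.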

Key words: Latent Semantic Indexing, Singular Value Decomposition, Text Classification, Information Retrieval

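Abstract (Chinese version, translated): This paper describes a new application of information extraction and automatic classification. It analyzes the shortcomings of traditional classification methods and presents an improved text classification scheme based on latent semantic indexing. The technique is a new type of retrieval model that uses singular value decomposition to strengthen or weaken the semantic influence of terms in documents, which makes the semantic relations between documents clearer. Noise data with weak semantic association can then be removed easily, improving the precision of feature extraction and the final classification accuracy.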
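
One way classification in the latent space might look is sketched below. The abstract does not say which classifier the paper uses, so the nearest-centroid rule with cosine similarity is an assumption swapped in for illustration. The function names `fit_lsi` and `classify`, the labels, and the toy matrix are likewise hypothetical.

```python
import numpy as np

def fit_lsi(A, k):
    """Truncated SVD of a term-document matrix A (terms x documents).

    Returns U_k (terms x k) and the k-dimensional coordinates of each
    training document, one row per document (equal to A.T @ U_k).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    docs_k = Vt[:k, :].T * s[:k]
    return U_k, docs_k

def classify(U_k, docs_k, labels, q):
    """Assign term vector q to the class whose centroid is most similar.

    Assumption: nearest centroid by cosine similarity in latent space.
    """
    labels = np.asarray(labels)
    z = q @ U_k  # project the new document into the latent space
    best_label, best_sim = None, -np.inf
    for label in np.unique(labels):
        centroid = docs_k[labels == label].mean(axis=0)
        denom = np.linalg.norm(z) * np.linalg.norm(centroid) + 1e-12
        sim = float(z @ centroid) / denom
        if sim > best_sim:
            best_label, best_sim = str(label), sim
    return best_label

# Hypothetical training data: rows are the terms wheat, grain, tractor, plough.
A = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
])
labels = ["crops", "crops", "machinery", "machinery"]

U_k, docs_k = fit_lsi(A, k=2)
pred_grain = classify(U_k, docs_k, labels, np.array([0.0, 1.0, 0.0, 0.0]))
pred_plough = classify(U_k, docs_k, labels, np.array([0.0, 0.0, 0.0, 1.0]))
print(pred_grain, pred_plough)
```

Projecting a new document with `q @ U_k` places it in the same coordinates as the training documents, so similarities are computed after the weak singular directions, where much of the noise is assumed to sit, have been discarded.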

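Keywords (Chinese version, translated): latent semantic indexing, singular value decomposition, text classification, information extraction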