Computer Engineering and Applications, 2007, Vol. 43, Issue 14: 192-194.
• Engineering and Applications •
朱学昊 王儒敬
Abstract: This paper presents a new application of information extraction and automatic classification. After analyzing the shortcomings of traditional classification methods, it introduces an improved text classification scheme based on latent semantic indexing (LSI). LSI is a new retrieval model based on singular value decomposition (SVD): the decomposition strengthens or weakens the semantic influence of each term in a document, which makes the semantic relationships between documents clearer. Noisy data with weak semantic association can then be removed at the outset, improving both the precision of feature extraction and the final classification accuracy.
Key words: Latent Semantic Indexing, Singular Value Decomposition, Text Classification, Information Extraction
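The abstract does not give the authors' implementation, so the following is only a minimal sketch of the general LSI idea it describes: a term-by-document matrix is reduced by truncated SVD, and documents are compared in the resulting latent space. The toy corpus, the term list, the labels, the choice k = 2, the nearest-neighbour decision rule, and all function names (`lsi_fit`, `fold_in`, `cosine`, `classify`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Rank-k truncated SVD of a term-by-document matrix (rows = terms)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(term_vec, U_k, s_k):
    """Map a term-space vector into the k-dimensional latent space."""
    return (U_k.T @ term_vec) / s_k

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(term_vec, U_k, s_k, Vt_k, labels):
    """Label of the training document closest to term_vec in latent space.

    Nearest-neighbour voting is an assumption of this sketch, not the
    classifier used in the paper.
    """
    q = fold_in(term_vec, U_k, s_k)
    sims = [cosine(q, Vt_k[:, j]) for j in range(Vt_k.shape[1])]
    return labels[int(np.argmax(sims))]

# Hypothetical toy corpus: 6 terms (rows) x 4 short postings (columns).
TERMS = ["buy", "purchase", "wanted", "sell", "supply", "offer"]
A = np.array([
    [1, 0, 0, 0],  # buy
    [0, 1, 0, 0],  # purchase
    [1, 1, 0, 0],  # wanted
    [0, 0, 1, 0],  # sell
    [0, 0, 0, 1],  # supply
    [0, 0, 1, 1],  # offer
], dtype=float)
LABELS = ["demand", "demand", "supply", "supply"]

U_k, s_k, Vt_k = lsi_fit(A, k=2)

# A new posting that contains only the term "purchase".
query = np.array([0, 1, 0, 0, 0, 0], dtype=float)

raw_sim = cosine(query, A[:, 0])                            # term space
latent_sim = cosine(fold_in(query, U_k, s_k), Vt_k[:, 0])   # latent space
predicted = classify(query, U_k, s_k, Vt_k, LABELS)

print(raw_sim, latent_sim, predicted)
```

On this toy data the query shares no term with the first posting ("buy", "wanted"), so their term-space cosine is 0. In the rank-2 latent space the two nearly coincide, because "buy" and "purchase" both co-occur with "wanted", while the similarity to the supply postings stays near 0. This is the sense in which the decomposition makes semantic relationships between documents clearer.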
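朱学昊, 王儒敬. Application of latent semantic indexing to the classification of supply and demand information [J]. Computer Engineering and Applications, 2007, 43(14): 192-194.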
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2007/V43/I14/192