Approach for mining associative terms in uncategorized English documents set

doi:10.3778/j.issn.1002-8331.2009.05.044

Computer Engineering and Applications, 2009, Vol. 45, Issue (5): 151-153.

• Database, Signal and Information Processing •

FU Zhong-kai, QIN Hua

College of Computer Science and Technology, Beijing University of Technology, Beijing 100022, China

Received: 2008-01-10 Revised: 2008-04-14 Online: 2009-02-11 Published: 2009-02-11
Contact: FU Zhong-kai

Chinese title: 在未分类英文文档集中挖掘相关词的方法

Authors (Chinese): 付仲恺, 秦华

Abstract

Abstract: In fields such as search-engine result relevance and speech recognition, accurately analyzing the relationship between two words is a key problem. To address it, this paper derives empirical conclusions about the degree of association between words in uncategorized web pages, using statistics on term frequency and term-pair co-occurrence frequency gathered over a large number of English web pages. Based on these conclusions, it proposes an approach and formula for ranking associative terms and implements a distributed prototype system.

Key words: data mining, web-page classification, association rules, sort algorithm, text representation


FU Zhong-kai, QIN Hua. Approach for mining associative terms in uncategorized English documents set[J]. Computer Engineering and Applications, 2009, 45(5): 151-153.
