Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (21): 152-156.

• Databases, Data Mining, Machine Learning •

Automatic extraction method of domain ontology concepts based on dynamic weight multi-strategy

ZHANG Huanan1, LIU Shengquan2, LIU Yan1, LIU Huapeng1, LI Peng1

  1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  2. Modern Educational Technology Center, Xinjiang University, Urumqi 830046, China
  • Online: 2014-11-01 Published: 2014-10-28

Abstract: To raise both the degree of automation and the accuracy of Chinese domain ontology concept extraction, this paper proposes an automatic concept extraction method that integrates multiple strategies under dynamic weights. Rule templates, learned automatically and tailored to the characteristics of Chinese domain ontology concepts, first select the candidate concepts. Improved versions of DR&DC, TF-IDF and NC-Value are then fused to rank the candidates by their degree of domain membership, and the concepts whose final weight exceeds a threshold are stored in the final concept set. Experiments demonstrate the feasibility and effectiveness of the method.

Key words: dynamic weight, ontology learning, multi-strategy, concept extraction
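
The fusion step described in the abstract (combine three per-candidate scores, rank, keep those above a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the dynamic weights are computed or how DR&DC, TF-IDF and NC-Value were improved, so the sketch assumes the three scores are already computed and takes the weights as plain inputs. The function names `min_max_normalize` and `fuse_and_filter`, the min-max normalization, and the weighted-sum fusion are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one strategy's scores to [0, 1] so strategies are comparable.

    Assumption: min-max scaling; the paper may normalize differently.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same, so this strategy cannot rank them.
        return {concept: 0.0 for concept in scores}
    return {concept: (s - lo) / (hi - lo) for concept, s in scores.items()}


def fuse_and_filter(
    strategy_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Fuse per-strategy scores by weighted sum and keep concepts above threshold.

    strategy_scores maps a strategy name (e.g. "tfidf") to {candidate: score}.
    weights maps the same strategy names to their (hypothetical) weights.
    Returns (candidate, fused_weight) pairs sorted from highest to lowest.
    """
    normalized = {
        name: min_max_normalize(scores)
        for name, scores in strategy_scores.items()
    }
    # Every candidate seen by any strategy; a missing score counts as 0.
    candidates = set()
    for scores in strategy_scores.values():
        candidates.update(scores)

    fused = {
        concept: sum(
            weights[name] * normalized[name].get(concept, 0.0)
            for name in normalized
        )
        for concept in candidates
    }
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return [(concept, weight) for concept, weight in ranked if weight > threshold]
```

Called with one score dictionary per strategy and weights that sum to 1, the function returns the ranked concepts whose fused weight exceeds the threshold. A weighting scheme that adapts per corpus or per candidate would replace the fixed `weights` argument.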