Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (6): 88-93. DOI: 10.3778/j.issn.1002-8331.2008-0105

• Pattern Recognition and Artificial Intelligence •

基于侧重点聚类的数学表达式相似度计算方法

杨芳,尹曦,司建辉,刘宏媛,汪雪   

  1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002
  • Publication date: 2021-03-15 Release date: 2021-03-12

Mathematical Expression Similarity Calculation Method Based on Focus Clustering

YANG Fang, YIN Xi, SI Jianhui, LIU Hongyuan, WANG Xue   

  1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002, China
  • Online: 2021-03-15 Published: 2021-03-12

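Abstract:

Computing the similarity of mathematical expressions plays an important role in information retrieval, but existing methods rarely consider how the focus of a mathematical expression affects the accuracy of the similarity calculation. To address this problem, a similarity calculation method for mathematical expressions based on focus clustering is proposed. Because focus is highly subjective, mapping rules for expression elements are defined and the K-means++ algorithm is used to cluster mathematical expressions, thereby identifying the focus cluster to which each expression belongs. Based on these focus clusters, a genetic algorithm is used to optimize the relevant parameters of the similarity calculation method, strengthening the influence of focus on the similarity results. Comparative experiments show that the method improves similarity calculation performance and yields a result list of expressions that is closer to the ideal.

Keywords: mathematical expressions, similarity, clustering, parameter tuning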

Abstract:

Mathematical expression similarity calculation plays an important role in information retrieval, but existing methods seldom consider how the focus of a mathematical expression affects the accuracy of the similarity calculation. To solve this problem, a method for calculating the similarity of mathematical expressions based on focus clustering is proposed. To handle the strong subjectivity of focus, expression element mapping rules are defined, and the K-means++ algorithm is used to cluster mathematical expressions based on their operators, thereby identifying the focus cluster to which each expression belongs. Based on the focus clusters, a genetic algorithm is used to optimize the relevant parameters of the similarity calculation method, strengthening the influence of focus on the similarity results. Comparative experiments show that the method improves similarity calculation performance and that the resulting list of expressions is closer to the ideal.

Key words: mathematical expressions, similarity, clustering, parameter adjustment
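
The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's element mapping rules, so everything specific here is an assumption made for illustration: the operator categories in `CATEGORIES`, the `TOKEN_TO_CATEGORY` table, the count-based feature vector, and the toy expressions are all hypothetical. The sketch only shows the general idea of mapping expressions to operator-based vectors and grouping them with K-means++ seeding followed by standard K-means iterations. The genetic-algorithm parameter tuning is not shown.

```python
import random

# Hypothetical element-mapping rule (not from the paper):
# one vector dimension per operator category.
CATEGORIES = ["arith", "power", "trig", "calculus"]
TOKEN_TO_CATEGORY = {
    "+": "arith", "-": "arith", "*": "arith", "/": "arith",
    "^": "power", "sqrt": "power",
    "sin": "trig", "cos": "trig", "tan": "trig",
    "int": "calculus", "sum": "calculus",
}


def to_vector(tokens):
    """Map a tokenized expression to counts per operator category."""
    vec = [0.0] * len(CATEGORIES)
    for tok in tokens:
        cat = TOKEN_TO_CATEGORY.get(tok)
        if cat is not None:  # operands such as "x" are ignored
            vec[CATEGORIES.index(cat)] += 1.0
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with an existing center
            centers.append(list(rng.choice(points)))
            continue
        r = rng.random() * total
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc > r:
                centers.append(list(p))
                break
        else:  # numerical safety net: fall back to the last point
            centers.append(list(points[-1]))
    return centers


def kmeans(points, k, iters=20, seed=0):
    """K-means with K-means++ seeding; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy, pre-tokenized expressions (illustrative only).
expressions = {
    "a+b*c": ["a", "+", "b", "*", "c"],
    "x-y/z": ["x", "-", "y", "/", "z"],
    "sin(x)+cos(x)": ["sin", "x", "+", "cos", "x"],
    "tan(x)*sin(y)": ["tan", "x", "*", "sin", "y"],
}
points = [to_vector(tokens) for tokens in expressions.values()]
labels, centers = kmeans(points, k=2)
for name, label in zip(expressions, labels):
    print(name, "-> focus cluster", label)
```

In this toy data the two arithmetic expressions fall into one cluster and the two trigonometric expressions into the other. The numeric cluster labels themselves are arbitrary and depend on the seed.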