Computer Engineering and Applications, 2016, Vol. 52, Issue (15): 93-96.

• Big Data and Cloud Computing •

Method about evaluating effect of data reduction

KANG Ruizhi, HAO Wenning

  1. Institute of Command Information System, PLA University of Science and Technology, Nanjing 210007, China
  • Online: 2016-08-01 Published: 2016-08-12

Abstract: The result of evaluating the effect of data reduction reflects the quality of the reduced data set, and it also serves as the basis for selecting and optimizing the relevant algorithms and the reduction process. Current evaluation of data reduction suffers from an incomplete indicator system, indicators with weak applicability, and evaluation methods that are not targeted at the task. To address these problems, this study proposes evaluation indicators, together with their calculation methods, that jointly reflect three aspects of a data set before and after reduction: the degree to which the average amount of information decreases, the degree to which statistical features differ, and the degree to which the amount of data decreases. These indicators can provide a quantitative basis for evaluating the effect of a data reduction scheme.

Key words: data reduction, evaluation indicators, information amount, statistical features, data amount
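
The abstract names three aspects to measure but does not give the formulas, so the following is only an illustrative sketch and not the authors' method. It assumes Shannon entropy per column as the "average amount of information", relative changes in per-column mean and standard deviation as the "statistical feature difference", and the ratio of cell counts as the "amount of data". It also assumes the reduced set keeps the same numeric columns as the original (row reduction only). All function names are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def average_entropy(rows):
    """Mean Shannon entropy (in bits) over the columns of a list of equal-length tuples."""
    entropies = []
    for col in zip(*rows):
        n = len(col)
        counts = Counter(col)
        entropies.append(-sum(c / n * math.log2(c / n) for c in counts.values()))
    return mean(entropies)


def information_loss(original, reduced):
    """Assumed indicator 1: relative decrease in average entropy."""
    h_before = average_entropy(original)
    h_after = average_entropy(reduced)
    return 0.0 if h_before == 0 else (h_before - h_after) / h_before


def _relative_change(before, after):
    """Relative absolute change, with a guard for a zero baseline."""
    if before == 0:
        return 0.0 if after == 0 else float("inf")
    return abs(after - before) / abs(before)


def statistical_difference(original, reduced):
    """Assumed indicator 2: mean relative change of column means and standard deviations."""
    diffs = []
    for col_o, col_r in zip(zip(*original), zip(*reduced)):
        mean_change = _relative_change(mean(col_o), mean(col_r))
        std_change = _relative_change(pstdev(col_o), pstdev(col_r))
        diffs.append((mean_change + std_change) / 2)
    return mean(diffs)


def volume_reduction(original, reduced):
    """Assumed indicator 3: fraction of cells (rows x columns) removed."""
    cells_before = len(original) * len(original[0])
    cells_after = len(reduced) * len(reduced[0])
    return 1 - cells_after / cells_before


# Toy data: the reduced set keeps the first and last rows of the original.
original = [(1, 10), (2, 20), (3, 30), (4, 40)]
reduced = [(1, 10), (4, 40)]

print(information_loss(original, reduced))
print(statistical_difference(original, reduced))
print(volume_reduction(original, reduced))
```

On this toy data the column means are unchanged, so the statistical difference comes only from the larger spread of the two retained rows. How the three values should be weighted or combined into a single score is left open here, since the abstract does not say.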