Computer Engineering and Applications, 2021, Vol. 57, Issue 7: 88-94. DOI: 10.3778/j.issn.1002-8331.2003-0386


Adaptive Early Warning Method for Streaming Big Data Events Based on Two-Stage Regression

ZHAO Linsuo, MA Ruiqiang, JIANG Tian, SONG Baoyan, PAN Yishan

  1. College of Mechanics and Engineering, Liaoning Technical University, Fuxin, Liaoning 123000, China
  2. School of Information, Liaoning University, Shenyang 110036, China
  • Online: 2021-04-01  Published: 2021-04-02

两级回归的流式大数据事件自适应预警方法

赵林锁,马瑞强,姜天,宋宝燕,潘一山   

  1. 辽宁工程技术大学 力学与工程学院,辽宁 阜新 123000
  2. 辽宁大学 信息学院,沈阳 110036

Abstract:

Streaming data drifts under the influence of factors such as the environment in which collectors are deployed and human interference. In addition, streaming data events occur randomly and follow no fixed pattern. As a result, existing methods identify data stream events with low accuracy, and they cannot produce a result until the event has fully ended, so identification lags behind the event. To address these problems, this paper proposes an adaptive two-stage regression method for real-time event identification in large data streams. First, drawing on massive historical disaster events, the method applies first-stage moving regression to establish a weight support region and extract the feature points of the event data. Second, it applies second-stage linear regression to build an event model, performs least-squares error analysis on the model, and constructs the event identification domain. Finally, it proposes a step-by-step real-time identification method for streaming data events: it introduces the concept of a confidence factor based on the event identification domain, estimates the future development trend of an event through an adaptive transformation strategy for the confidence factor, and thereby identifies events in real time. Experiments show that the proposed method has significant advantages over existing methods in both the efficiency and the accuracy of event identification.

Key words: moving regression method, linear regression method, confidence factor, adaptive transformation, real-time early warning
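
The abstract does not give the formulas behind the two regression stages or the confidence factor, so the following is only a minimal sketch of how such a pipeline might be arranged, not the authors' implementation. Every name and parameter in it (`fit_line`, `moving_regression`, `build_event_model`, `stream_warning`, `window`, `k`, `threshold`, `gain`, `decay`) is hypothetical. It assumes that stage one is a sliding-window least-squares smoother, that stage two is a single line fitted to the smoothed points with a tolerance band of `k` residual standard deviations standing in for the event identification domain, and that the confidence factor is a score that rises while incoming samples stay inside the band and falls otherwise.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def moving_regression(ys, window=5):
    """Stage 1 (assumed form): fit a line to each sliding window and
    keep the fitted value at the window's centre index."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo, hi = max(0, i - half), min(len(ys), i + half + 1)
        a, b = fit_line(list(range(lo, hi)), ys[lo:hi])
        out.append(a + b * i)
    return out


def build_event_model(history, window=5, k=2.0):
    """Stage 2 (assumed form): fit one line to the smoothed points and
    derive a band of k residual standard deviations around it."""
    smooth = moving_regression(history, window)
    xs = list(range(len(smooth)))
    a, b = fit_line(xs, smooth)
    # Residuals of the raw history against the fitted line.
    resid = [y - (a + b * x) for x, y in zip(xs, history)]
    sigma = (sum(r * r for r in resid) / max(1, len(resid) - 2)) ** 0.5
    return {"a": a, "b": b, "band": k * sigma}


def stream_warning(model, stream, threshold=0.75, gain=0.25, decay=0.5):
    """Update a hypothetical confidence factor sample by sample and
    return the index of the first warning, or None if none is raised."""
    conf = 0.0
    for t, y in enumerate(stream):
        expected = model["a"] + model["b"] * t
        if abs(y - expected) <= model["band"]:
            conf = min(1.0, conf + gain)   # sample matches the event model
        else:
            conf = max(0.0, conf - decay)  # sample contradicts the model
        if conf >= threshold:
            return t
    return None


# Synthetic "historical event": a ramp of slope 2 with alternating noise.
history = [2 * t + (0.5 if t % 2 else -0.5) for t in range(20)]
model = build_event_model(history)
matching = stream_warning(model, [2.0 * t for t in range(10)])
flat = stream_warning(model, [0.0] * 10)
```

In this sketch a warning can be raised after only a few matching samples, before the event has run its course, which is the behaviour the abstract describes as avoiding identification lag. The fixed `gain` and `decay` steps are a simplification; the adaptive transformation strategy described in the paper may adjust the factor differently.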

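Abstract (Chinese version):

Streaming data events persist over time. Owing to factors such as collector sampling frequency and external environmental interference, streaming data is large in scale and subject to data drift, and events occur randomly. As a result, existing early warning methods for streaming data events have low accuracy and cannot reach an identification result before the event has fully ended, so warnings lag behind. To address these problems, this paper proposes an adaptive early warning method for streaming big data events based on two-stage regression. Drawing on massive historical disaster events, the method first introduces first-stage moving regression to establish a weight support region and extract the data feature points of events. It then builds an event regression model through second-stage linear regression and performs least-squares error analysis on the model to establish an event confidence region; together these form the early warning model. Based on this model, the paper introduces the concept of an identification factor and proposes a staged early warning method for streaming data events, which estimates the future development trend of an event in advance through an adaptive transformation strategy for the identification factor and so achieves real-time early warning. Experimental results show that, compared with existing methods, the proposed method has significant advantages in warning timeliness, warning efficiency, and warning accuracy.

Key words (Chinese version): moving regression method, linear regression method, identification factor, adaptive transformation, real-time early warning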