Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (3): 68-73.DOI: 10.3778/j.issn.1002-8331.1605-0057

Previous Articles     Next Articles

Improvement of multi-patterns string matching algorithms of Advanced AC

FAN Hongbo, SHI Shupeng, ZHANG Jing   

  1. School of Information Engineering & Automation, Kunming University of Science & Technology, Kunming 650504, China
  • Online:2017-02-01 Published:2017-05-11

改进的AAC多模式实时匹配算法

范洪博,史舒鹏,张  晶   

  1. 昆明理工大学 信息工程与自动化学院,昆明 650504

Abstract: The AAC algorithm has high and stable performance. It is the most widely used multi-patterns string matching algorithm. In AAC, to determine whether the automaton reaches the final state(means a matching found), the output table should be visited per character read, which produces high cost. To reduce the cost of output table visited, this paper improves AAC by two methods. One method is copying final states, adding them after the original automaton and modifying the transitions whose target state is final to the additive state. The other method is modifying the transitions whose target state is final and making the transition store the negative value of target state. Thereby, two algorithms are presented named Advanced AC with Additive state(AACA)and Advanced AC with Negative state(AACN). These two algorithms can determine whether the automaton reaches the final state only by the value of the target state of transition and the visit of output table reduces from per character read to per matching occurrence. The experiment result indicates that AACA and AACN are faster than AAC, especially in small scale string matching. In the mean research field of the team, the timing models of Cyber-Physical Systems(CPS), the multiple timing patterns matching in the timing reachability graph of the CPS is the key problem. Since the timing pattern is rapidly changed, the real-time of matching is more important while the scale of matching is not big. AACA and AACN fit this area which can increase the accuracy and real-time of hybrid embedded system timing models.

Key words: Advanced AC(AAC) algorithm, multi-patterns, automata, pattern matching

摘要: AAC算法(Advanced AC)是使用最为广泛的多模式串匹配算法,匹配性能高,匹配时间稳定。针对AAC算法为判定转移目标状态是否为终结状态,在匹配时每读入一个字符都要访问output表,代价较高的问题,通过两种方法改进了AAC算法。第一种方法为拷贝自动机中的终结状态,将其附加在AAC自动机后,并将原自动机中指向终结状态的转移目标修改为附加状态,直接根据转移目标位置判断当前状态是否是终结状态,从而提出Advanced AC with Additive state(AACA)算法。第二种改进方法为将自动机中指向终结状态的状态转移值置为负数,根据转移目标的值直接判断目标状态是否为终结状态,从而提出Advanced AC with Negative state(AACN)算法。以上两种改进算法只有在发现模式匹配时才需进行output表的访问。实验数据表明:AACA和AACN算法性能均高于AAC算法,特别在中小规模匹配上,性能提升更为明显。

关键词: 改进的AC(AAC)算法, 多模式, 自动机, 模式匹配