计算机工程与应用 ›› 2016, Vol. 52 ›› Issue (8): 33-37.

• 理论与研发 • 上一篇    下一篇

基于KMP算法的改进算法KMPP

李  莉1,江育娥1,林  劼1,江秉华2   

  1. 1.福建师范大学 软件学院,福州 350100
    2.南京医科大学 病理系,南京 210029
  • 出版日期:2016-04-15 发布日期:2016-04-19

Improved algorithm KMPP based on KMP

LI Li1, JIANG Yu’e1, LIN Jie1, JIANG Binghua2   

  1. 1.Faculty of Software, Fujian Normal University, Fuzhou 350100, China
    2.Department of Pathology, Nanjing Medical University, Nanjing 210029, China
  • Online:2016-04-15 Published:2016-04-19

摘要: KMP算法和BM算法是经典的单模式匹配算法,但KMP算法中文本指针[i]每次只能移动一个字符,整体的匹配效率并不高,结合KMP算法和BM算法的优点提出一种改进算法(KMPP)。算法的思想是模式串与文本在[j]处不匹配时,预算出模式串移动[next[j]]后末字符在文本中的位置,当该位置的文本字符与末字符不匹配时,则用该字符进行坏字符匹配,这两步的跳跃距离就是文本指针[i]移动的距离,从而使指针[i]每次移动的距离达到最大。实验结果表明,该算法匹配次数远低于KMP算法的匹配次数,提高了模式匹配的效率。

关键词: 模式匹配, KMP算法, BM算法, KMPP算法

Abstract: KMP algorithm and BM algorithm are classical single pattern matching algorithms, because the pointer [i] in the text can only move one character at each time in KMP algorithm, the overall matching efficiency is not high, the improved algorithm(KMPP) accordingly to combine the advantages of the KMP algorithm with BM algorithm together is proposed. The core of KMPP algorithm is when a mismatch occurs at position [j], it calculates the position in the text where the last character of pattern will align with after the pattern moving the value of [next[j]]. If the last character of pattern does not match with the corresponding text position, the bad character rules are applied. The moving distance of pointer [i] in the text is a two-step jumping distance, thus the pointer can move farthest at each time. The experimental result shows that the comparison number of KMPP algorithm is far less than the KMP algorithm’s number and it improves the efficiency of pattern matching algorithm.

Key words: pattern matching, KMP algorithm, BM algorithm, KMPP algorithm