计算机工程与应用 ›› 2011, Vol. 47 ›› Issue (18): 68-72.

• 网络、通信、安全 • 上一篇    下一篇

面向移动终端的URL过滤方法

刘 夏1,2,3,刘 萍1,2,刘燕兵1,2,谭建龙1,2   

  1. 1.中国科学院 计算技术研究所,北京 100190
    2.信息安全技术国家工程实验室,北京 100190
    3.中国科学院 研究生院,北京 100049
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2011-06-21 发布日期:2011-06-21

URL-filtering method for mobile terminals

LIU Xia1,2,3,LIU Ping1,2,LIU Yanbing1,2,TAN Jianlong1,2   

  1. 1.Institute of Computing Technology,Chinese Academy of Sciences,Beijing 100190,China
    2.National Engineering Laboratory for Information Security Technologies,Beijing 100190,China
    3.Graduate School of Chinese Academy of Sciences,Beijing 100049,China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-06-21 Published:2011-06-21

摘要: 在移动终端内容安全检测中,“黑名单”过滤是一种常用的手段,但有限的存储空间制约了它的应用。根据“黑名单”过滤特点研究了一种多串匹配算法的改进,以Aho-Corasick算法为例,采用两种启发式策略从不等长的URL串中提取具有代表性的、等长的模式子串,并使用双数组进一步压缩。在Nokia 5230上的测试表明,该算法的存储空间是经典AC算法的0.7%,而速度可达到95%以上。

关键词: 移动终端, Aho-Corasick算法, 空间压缩, 启发式策略, URL过滤

Abstract: In the mobile terminals’ content security monitoring,blacklist is a common way,but the limited memory restrains its application.Based on the characteristics of blacklist filtering,this paper comes up with an improvement for multiple pattern string matching algorithms.This paper uses Aho-Corasick(AC) as an example.Two heuristic strategies are applied to extract representative pattern strings of the same length from URL strings which have different lengths.Then this structure is further compressed using double-array.The experiments in Nakia 5230 show that the algorithm only consumes 0.7% of the memory compared to traditional AC,while maintaining 95% of the speed of latter.

Key words: mobile terminal, Aho-Corasick(AC), memory compress, heuristic strategy, URL filtering