Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (16): 42-46.

Fault detection technique of self-healing operating system based on dynamic tracing

SHI Jialong1, ZHU Yi’an1, LU Wei2, CHAI Ruiya1   

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
  2. School of Software and Microelectronics, Northwestern Polytechnical University, Xi’an 710072, China
  • Online: 2015-08-15 Published: 2015-08-14

Fault monitoring technique for self-healing operating systems based on dynamic tracing

史佳龙1,朱怡安1,陆  伟2,柴瑞亚1   

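  1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710072
  2. School of Software and Microelectronics, Northwestern Polytechnical University, Xi’an 710072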

Abstract: Code segments that perform dynamic memory allocation and resource preemption are a major source of operating system faults. This paper proposes a new fault detection technique based on dynamic kernel tracing: by collecting information about the kernel call stack and global state transitions, it identifies the source and type of a fault. The technique is implemented as a loadable kernel module in Linux and collects kernel information without additional hardware or modification of the original system code. Fault injection experiments show that the proposed technique detects faults effectively and that its detection delay is shorter than that of earlier methods based on time-outs and performance metrics.
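
To make the idea of analyzing traced state transitions concrete, the following is a minimal user-space sketch in Python, not the paper's kernel module. It assumes a trace has already been collected as a sequence of events, and it replays that sequence against a few simple rules to report a fault type and the call site it came from. The event fields, the rule set, the fault names, and the call-site labels (`func_a`, `func_b`, and so on) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    op: str        # "alloc", "free", "lock", or "unlock"
    resource: str  # hypothetical identifier: a memory address or a lock name
    caller: str    # hypothetical call-site label taken from the recorded call stack


def detect_faults(events):
    """Replay a trace and return a list of (fault_type, resource, caller)."""
    allocated = {}  # address -> caller that allocated it
    held = {}       # lock name -> caller that acquired it
    faults = []
    for ev in events:
        if ev.op == "alloc":
            allocated[ev.resource] = ev.caller
        elif ev.op == "free":
            if ev.resource in allocated:
                del allocated[ev.resource]
            else:
                # Freeing memory that is not currently allocated.
                faults.append(("invalid_free", ev.resource, ev.caller))
        elif ev.op == "lock":
            if ev.resource in held:
                # Acquiring a lock that has not been released yet.
                faults.append(("lock_already_held", ev.resource, ev.caller))
            else:
                held[ev.resource] = ev.caller
        elif ev.op == "unlock":
            if ev.resource in held:
                del held[ev.resource]
            else:
                faults.append(("unbalanced_unlock", ev.resource, ev.caller))
        else:
            raise ValueError(f"unknown operation: {ev.op}")
    # Anything still outstanding when the trace ends is reported with the
    # call site that created it, which points at the fault source.
    for resource, caller in allocated.items():
        faults.append(("possible_leak", resource, caller))
    for resource, caller in held.items():
        faults.append(("lock_never_released", resource, caller))
    return faults


# Hypothetical trace containing one fault of each outstanding kind.
trace = [
    TraceEvent("alloc", "0x1000", "func_a"),
    TraceEvent("free", "0x1000", "func_a"),
    TraceEvent("free", "0x1000", "func_b"),   # second free of the same address
    TraceEvent("lock", "L1", "func_c"),
    TraceEvent("lock", "L1", "func_d"),       # L1 is still held
    TraceEvent("alloc", "0x2000", "func_e"),  # never freed
]
faults = detect_faults(trace)
```

In this sketch each resource is modeled as a small state machine (free or allocated, unlocked or held), and a fault is any traced call that requests a transition the current state does not allow. A kernel implementation would obtain the events from tracing hooks rather than from a prepared list.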

Key words: self-healing operating system, fault detection, dynamic kernel tracing

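Abstract: Operating system kernel faults tend to be concentrated in specific locations, and code segments related to dynamic memory allocation and resource contention are typical fault concentration points. For these two classes of fault concentration points, this paper proposes a new fault monitoring technique based on dynamic kernel tracing. The technique traces the method calls that cause state transitions of kernel global data and analyzes the recorded call sequences and data according to a set of designed rules in order to monitor and locate faults. It is implemented in the Linux operating system as a loadable kernel module and requires neither additional hardware support nor modification of the original system code. Fault injection experiments verify the effectiveness of the technique, and its monitoring delay is lower than that of existing fault monitoring techniques based on time and system performance metrics.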

Key words: self-healing operating system, fault monitoring, dynamic kernel tracing