Software-Implemented Stable Storage in Main Memory



Stable storage is a type of memory whose contents must survive system malfunctions. It is a key element of fault-tolerant systems, which rely on it for checkpointing and rollback recovery. It is usually implemented on hard disks, because they are non-volatile and robust against processor faults. However, for many control systems the low speed and unpredictable timing of disks are unacceptable under real-time constraints, and in embedded systems in particular disks are often simply not available. In such cases, main memory should be used.
In this paper we present and test a stable storage mechanism that relies exclusively on local main memory and is able to survive processor malfunctions. We show that, through careful use of some very common processor features, such as memory protection, very high data survivability across system crashes can be obtained. The study was conducted on COTS computers and a commercial real-time executive running two sample applications, which were subjected to intensive fault injection campaigns. To the best of our knowledge, this is the first time that software-implemented stable storage in RAM has been presented and tested by fault injection.
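
As an illustration of how memory protection can shield data held in RAM from stray writes, the following is a minimal sketch and not the mechanism evaluated in the paper. It assumes a POSIX environment (`mmap`, `mprotect`) rather than the real-time executive used in the study, and the names `stable_region`, `stable_init`, `stable_write` and `stable_verify`, as well as the Adler-32 style checksum, are hypothetical choices made for the example. The region stays read-only except during an explicit update, and a checksum stored after the payload lets a recovery routine detect corruption.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout: payload bytes followed by a 32-bit checksum. */
typedef struct {
    uint8_t *base;     /* page-aligned mapping, read-only between updates */
    size_t   map_len;  /* mapping length, a multiple of the page size */
    size_t   data_len; /* payload length in bytes */
} stable_region;

/* Adler-32 style checksum (an assumption; any integrity code would do). */
static uint32_t checksum(const uint8_t *p, size_t n) {
    uint32_t a = 1, b = 0;
    for (size_t i = 0; i < n; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Map a zero-filled region, store its checksum, then write-protect it. */
int stable_init(stable_region *r, size_t data_len) {
    assert(r != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t need = data_len + sizeof(uint32_t);
    r->map_len = (need + page - 1) / page * page;
    r->data_len = data_len;
    r->base = mmap(NULL, r->map_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (r->base == MAP_FAILED) return -1;
    uint32_t sum = checksum(r->base, data_len);
    memcpy(r->base + data_len, &sum, sizeof sum);
    return mprotect(r->base, r->map_len, PROT_READ);
}

/* Open the write window, copy the data, refresh the checksum, close it. */
int stable_write(stable_region *r, size_t off, const void *src, size_t n) {
    if (off > r->data_len || n > r->data_len - off) return -1;
    if (mprotect(r->base, r->map_len, PROT_READ | PROT_WRITE) != 0) return -1;
    memcpy(r->base + off, src, n);
    uint32_t sum = checksum(r->base, r->data_len);
    memcpy(r->base + r->data_len, &sum, sizeof sum);
    return mprotect(r->base, r->map_len, PROT_READ);
}

/* Return 1 if the stored checksum matches the payload, 0 otherwise. */
int stable_verify(const stable_region *r) {
    uint32_t stored;
    memcpy(&stored, r->base + r->data_len, sizeof stored);
    return stored == checksum(r->base, r->data_len);
}
```

In this sketch a wild store from faulty code outside `stable_write` raises a protection fault instead of silently corrupting the data, and `stable_verify` catches corruption that slips through while the write window is open. It does not address power loss, since the memory itself remains volatile.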


Stable Storage, Fault-Injection, COTS


IX Brazilian Symposium on Fault-Tolerant Computing (SCTF'2001), March 2001

