Adaptive Failure Prediction for Computer Systems: a Framework and a Case Study



Failure prediction can improve the dependability of computer systems, but its use is still uncommon due to the scarcity of failure-related data that can be used for training, assessing, and comparing alternative failure predictors. Because failures are rare events and the characteristics of failure data vary from system to system, in this paper we propose the use of realistic software fault injection to facilitate the generation of failure data on a particular system installation. In practice, we propose a comprehensive experimental approach that generates failure data in a short time, and we study the applicability and limitations of this process in assessing and comparing alternative failure prediction algorithms. A case study is presented comparing four algorithms for predicting failures in a system based on a Windows OS. Results show that fault injection dramatically speeds up the generation of failure data and that the proposed procedure can be used in practice.


Dependability, self-adaptive systems, online failure prediction, software fault injection


Adaptive online failure prediction


HASE 2015 - 16th IEEE International Symposium on High Assurance Systems Engineering, January 2015

