17th Feb 21
SIGDep 2021 Seminars - DEI - UC
Wednesday, February 17 - 14h
Frederico Cerveira (CISUC, UC) will give the first presentation of the SIGDep 2021 edition.
Title: Fault Tolerance for Cloud Computing using Romulus
Abstract: Many organizations are moving their systems to the cloud in search of reduced costs. Nevertheless, the cloud is still viewed with doubt by organizations that have mission-critical applications with strict dependability requirements, in part because computing resources are shared among clients through virtualization. Ultimately, if cloud computing is to be universally adopted, it must provide dependability guarantees similar to those of dedicated infrastructure.
In this talk, I will describe Romulus, a fault tolerance technique for cloud computing and virtualized infrastructure born from the observation that hypervisors fail, often causing common-mode failures that disrupt many virtual machines simultaneously, and the hypothesis that a significant percentage of the affected virtual machines are capable of continuing execution on a new hypervisor. Romulus recovers from hypervisor failures by efficiently migrating all virtual machines from the failed hypervisor to a co-located hypervisor, thus allowing virtual machines to continue executing with minimal downtime. A proof-of-concept implementation was evaluated using fault injection to confirm Romulus' effectiveness.
Frederico Cerveira is a PhD student and researcher at the University of Coimbra, Portugal. His PhD work evaluates current cloud computing systems in the presence of hardware and software faults, with the aim of proposing mechanisms that increase the dependability of these systems. His main research interests include dependability, fault injection, and fault tolerance, mainly in the context of virtualized and cloud computing systems.