UBC Theses and Dissertations
A distributed snapshot protocol for virtual machines
Peng, Gang
Distributed snapshot protocols are a critical technology for disaster recovery and for the security of distributed systems, and a large number of projects have worked on this topic since the 1970s. Recently, with the growing popularity of parallel computing and disaster recovery, the topic has received increasing attention from both academic and industrial researchers. However, existing protocols share two disadvantages. First, they all require modifications to the target processes or their OS, which is usually error-prone and sometimes impractical. Second, they only take snapshots of processes, not of entire OS images, which limits the areas in which they can be applied. This thesis presents the design and implementation of a hypervisor-level, coordinated, non-blocking distributed snapshot protocol. Compared with existing protocols, it provides a simpler and fully transparent snapshot platform for both the target processes and their OS images. Based on several observations of the target environment, we simplify the protocol by intentionally ignoring channel states. To hide the protocol from the target processes and their OS, we exploit VM technology to silently insert the protocol beneath the target OS, and we design and implement two kernel modules and a management daemon system in the control domain. We test the protocol with several popular benchmarks, and the experimental results confirm its correctness and efficiency.
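To make the terms "coordinated", "non-blocking", and "ignoring channel states" concrete, the following is a minimal toy sketch in Python. It is not the protocol from the thesis: it is a generic marker-based scheme in the style of Chandy and Lamport with channel-state recording left out, and every name in it (`Node`, `take_snapshot`, `send`, `step`, `drain`) is hypothetical. It also assumes FIFO channels and uses a single integer as a stand-in for a whole VM image. The abstract does not say which observations justify ignoring channel states, so the sketch simply omits that step.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one VM; `state` plays the role of its image."""
    name: str
    state: int = 0
    epoch: int = 0  # highest snapshot epoch this node has recorded
    snapshots: dict = field(default_factory=dict)
    inbox: deque = field(default_factory=deque)
    peers: list = field(default_factory=list, repr=False, compare=False)

    def take_snapshot(self, epoch):
        """Record local state once per epoch, then send markers to all peers."""
        if epoch <= self.epoch:
            return  # already recorded for this epoch
        self.epoch = epoch
        self.snapshots[epoch] = self.state  # local state only, no channel state
        for peer in self.peers:
            peer.inbox.append(("marker", epoch, None))

    def send(self, peer, amount):
        """Send an application message tagged with the sender's current epoch."""
        self.state -= amount
        peer.inbox.append(("data", self.epoch, amount))

    def step(self):
        """Deliver one queued message if there is one; the node never pauses."""
        if not self.inbox:
            return False
        kind, epoch, payload = self.inbox.popleft()
        # A message from a newer epoch triggers a snapshot before delivery,
        # so no snapshot records a receive whose send is missing from the cut.
        self.take_snapshot(epoch)
        if kind == "data":
            self.state += payload
        return True


def drain(nodes):
    """Run all nodes until every inbox is empty."""
    progressed = True
    while progressed:
        progressed = False
        for node in nodes:
            while node.step():
                progressed = True


if __name__ == "__main__":
    a, b, c = Node("a", 100), Node("b", 100), Node("c", 100)
    a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
    a.send(b, 10)        # sent before the snapshot starts
    a.take_snapshot(1)   # node a initiates epoch 1
    a.send(b, 5)         # sent after a's snapshot; a keeps running
    drain([a, b, c])
    print({n.name: n.snapshots[1] for n in (a, b, c)})
```

In this run the initiator keeps sending while the snapshot is in progress, which is the non-blocking aspect, and each node records only its own state. A message that is still in flight when its receiver records its state would be absent from the snapshot; the sketch makes no attempt to capture it.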