UBC Theses and Dissertations

Using idle workstations to implement parallel prefetch prediction

Wang, Jasmine Yongqi

Abstract

The benefits of prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. Although theoretical and simulation-based results for prediction algorithms such as Prediction by Partial Matching (PPM) appear promising, practical results have thus far been disappointing. This outcome can be attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of complexity, variety, and granularity used in the policies and mechanisms the implementation supports. This thesis examines the use of idle workstations to implement prediction-based prefetching. We propose a novel framework that leverages the resources of a system area network to reduce I/O stall time by having an idle node prefetch non-resident pages into a target node's memory. This configuration allows prediction to run in parallel with a target application. We have implemented a revised version of the GMS global memory system, called GMS-3P, that provides parallel prediction-based prefetching. We discuss the different algorithms we have chosen and the policies and mechanisms used to control the quality of predictions. We have also implemented a low-overhead mechanism to communicate the history fault trace between the active node and the prediction node. This thesis also explores the needs of programs whose access patterns cannot be captured by a single configuration of PPM. The dilemma associated with conventional prediction mechanisms that attempt to accommodate this behaviour is that varying the configuration adds overhead to the prediction mechanism. By moving the prediction mechanism to an idle node, we were able to add this functionality without compromising performance on the application node. Our results show that for some applications in GMS-3P, the I/O stall time can be reduced by as much as 77%, while introducing an overhead of 4-8% on the node actively running the application.
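As a rough illustration of the kind of prediction the abstract refers to, the following is a minimal sketch of a PPM-style predictor over a page-fault trace. It is not the GMS-3P implementation: the class name `PPMPredictor`, its methods, and the `max_order` parameter are hypothetical, and the sketch simply picks the most frequent successor of the longest matching context rather than computing the escape probabilities a full PPM model would use.

```python
from collections import Counter, defaultdict


class PPMPredictor:
    """Toy PPM-style predictor over a page-fault trace (illustrative only)."""

    def __init__(self, max_order=2):
        # max_order is an assumed tuning knob: the longest context length tracked.
        self.max_order = max_order
        # Maps a context (tuple of recent page numbers) to a Counter of the
        # pages that were observed to fault immediately after that context.
        self.successors = defaultdict(Counter)
        self.history = []

    def record_fault(self, page):
        """Update successor counts for every context length up to max_order."""
        for k in range(1, self.max_order + 1):
            if len(self.history) >= k:
                context = tuple(self.history[-k:])
                self.successors[context][page] += 1
        self.history.append(page)
        # Keep only the most recent max_order faults; older ones are not needed.
        del self.history[:-self.max_order]

    def predict(self):
        """Return the most likely next page, or None if no context matches."""
        # Try the longest available context first, then fall back to shorter ones.
        for k in range(min(self.max_order, len(self.history)), 0, -1):
            counts = self.successors.get(tuple(self.history[-k:]))
            if counts:
                return counts.most_common(1)[0][0]
        return None


predictor = PPMPredictor(max_order=2)
for page in [1, 2, 3, 1, 2, 3, 1, 2]:
    predictor.record_fault(page)
next_page = predictor.predict()  # candidate page to prefetch
```

In the arrangement the abstract describes, bookkeeping of this kind would run on the idle node, with the active node only forwarding its fault trace; the sketch above omits that communication entirely.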

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply; see the Terms of Use: https://open.library.ubc.ca/terms_of_use.