Using idle workstations to implement parallel prefetch prediction

UBC Theses and Dissertations

Using idle workstations to implement parallel prefetch prediction Wang, Jasmine Yongqi

Abstract

The benefits of prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. Although theoretical and simulation-based results for prediction algorithms such as Prediction by Partial Matching (PPM) appear promising, practical results have thus far been disappointing. This outcome can be attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of complexity, variety, and granularity used in the policies and mechanisms the implementation supports. This thesis examines the use of idle workstations to implement prediction-based prefetching. We propose a novel framework that leverages the resources in a system area network to reduce I/O stall time by having an idle node prefetch non-resident pages into a target node's memory. This configuration allows prediction to run in parallel with a target application. We have implemented a revised version of the GMS global memory system, called GMS-3P, that provides parallel prediction-based prefetching. We discuss the different algorithms we have chosen and the policies and mechanisms used to control the quality of predictions. We have also implemented a low-overhead mechanism to communicate the fault history trace between the active node and the prediction node. This thesis also explores the needs of programs whose access patterns cannot be captured by a single configuration of PPM. The dilemma associated with conventional prediction mechanisms that attempt to accommodate this behaviour is that varying the configuration adds overhead to the prediction mechanism. By moving the prediction mechanism to an idle node, we were able to add this functionality without compromising performance on the application node. Our results show that for some applications in GMS-3P, the I/O stall time can be reduced by as much as 77%, while introducing an overhead of 4-8% on the node actively running the application.
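
For readers unfamiliar with PPM, the following is a minimal illustrative sketch of a PPM-style predictor over a fault trace. It is not the GMS-3P implementation: the class name, the `max_order` and `threshold` parameters, and the simple fall-back to shorter contexts (in place of PPM's escape probabilities) are all assumptions made for illustration. In the setting the abstract describes, a model of this kind would presumably run on the idle prediction node, fed by the fault trace sent from the active node.

```python
from collections import Counter, defaultdict, deque


class PPMPredictor:
    """Toy order-k, PPM-style predictor over a page-fault trace.

    Hypothetical sketch: names and parameters are illustrative only.
    """

    def __init__(self, max_order=2, threshold=0.5):
        self.max_order = max_order    # longest context length kept
        self.threshold = threshold    # minimum probability to prefetch
        # counts[context][page] = times `page` followed `context`
        self.counts = defaultdict(Counter)
        self.history = deque(maxlen=max_order)

    def record_fault(self, page):
        """Update every context of length 1..k with the new fault."""
        hist = tuple(self.history)
        for order in range(1, len(hist) + 1):
            self.counts[hist[-order:]][page] += 1
        self.history.append(page)

    def predict(self):
        """Return candidate pages, trying the longest context first."""
        hist = tuple(self.history)
        for order in range(len(hist), 0, -1):
            successors = self.counts.get(hist[-order:])
            if not successors:
                continue
            total = sum(successors.values())
            picks = [page for page, n in successors.most_common()
                     if n / total >= self.threshold]
            if picks:
                return picks
        return []  # no context seen often enough: prefetch nothing


predictor = PPMPredictor(max_order=2, threshold=0.5)
for page in [1, 2, 3, 1, 2, 3, 1, 2]:
    predictor.record_fault(page)
print(predictor.predict())
```

Raising `max_order` or lowering `threshold` changes which pages are proposed, at the cost of more bookkeeping per fault. That per-fault cost is the kind of overhead the abstract argues can be moved off the application node.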

Item Metadata

Title	Using idle workstations to implement parallel prefetch prediction
Creator	Wang, Jasmine Yongqi
Publisher	University of British Columbia
Date Issued	1999
Extent	5042190 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-06-29
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051684
URI	http://hdl.handle.net/2429/9820
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	1999-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1999-0641.pdf -- 4.81MB
