Open Collections
UBC Theses and Dissertations
Cache and branch prediction improvements for advanced computer architecture
Chu, Yul
Abstract
As the gap between memory and processor performance continues to grow, more and more programs will be limited in performance by the memory latency of the system and by branch instructions (the control flow of the programs). Meanwhile, as application programs have grown more complex over the last decade, object-oriented languages have been replacing traditional languages because they make code easier to reuse and maintain. However, it has also been observed that the run-time performance of object-oriented programs can be improved by reducing the impact of memory latency, branch misprediction, and several other factors. In this thesis, two new schemes are introduced for reducing memory latency and branch mispredictions in High Performance Computing (HPC).
For the first scheme, which targets memory latency, this thesis presents a new cache scheme called TAC (Thrashing-Avoidance Cache), which can effectively reduce instruction cache misses caused by procedure calls and returns. The TAC scheme employs N-way banks and XOR mapping functions. Its main function is to place a group of instructions separated by a call instruction into a bank according to the initial and final bank selection mechanisms. On an instruction cache miss, the initial bank selection mechanism selects a bank; the final bank selection mechanism then acts as a correction mechanism and determines the bank in which the cache line is actually updated. Together, these two mechanisms can guarantee that recent groups of instructions are kept safely in each bank. A simulation program, TACSim, has been developed using Shade and Spixtools, provided by Sun Microsystems, on an Ultra SPARC/10 processor. Our experimental results show that TAC schemes reduce conflict misses more effectively than skewed-associative caches in both C (9.29% improvement) and C++ (44.44% improvement) programs on L1 caches. In addition, TAC schemes also allow for a significant miss reduction on Branch Target Buffers (BTB).
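The abstract names N-way banks and XOR mapping functions but does not spell out the mapping itself. As a rough illustration only, the sketch below shows a generic skewed-style XOR index, in which each bank hashes the address differently so that lines colliding in one bank tend to land in different sets of another. The line size, the number of sets, the rotation-based hash, and every function name here are assumptions made for this example; the sketch does not reproduce TAC's initial and final bank selection mechanisms.

```python
# Hypothetical parameters, chosen only for illustration.
LINE_BITS = 5                     # 32-byte cache lines (assumed)
INDEX_BITS = 7                    # 128 sets per bank (assumed)
MASK = (1 << INDEX_BITS) - 1


def split(addr):
    """Return the two INDEX_BITS-wide fields just above the line offset."""
    block = addr >> LINE_BITS
    low = block & MASK
    high = (block >> INDEX_BITS) & MASK
    return low, high


def rotate_left(value, amount):
    """Rotate an INDEX_BITS-wide value left by `amount` bits."""
    amount %= INDEX_BITS
    return ((value << amount) | (value >> (INDEX_BITS - amount))) & MASK


def conventional_index(addr):
    """Set index of an ordinary cache: the low field only."""
    low, _ = split(addr)
    return low


def xor_index(addr, bank):
    """Set index for `bank`: low field XOR a bank-specific rotation of the high field."""
    low, high = split(addr)
    return low ^ rotate_left(high, bank)


# Three example addresses built from (high, low) fields.
a = (1 << (LINE_BITS + INDEX_BITS)) | (3 << LINE_BITS)   # high=1, low=3
b = (2 << (LINE_BITS + INDEX_BITS)) | (3 << LINE_BITS)   # high=2, low=3
c = (2 << (LINE_BITS + INDEX_BITS))                      # high=2, low=0

for name, addr in (("a", a), ("b", b), ("c", c)):
    print(name, conventional_index(addr), xor_index(addr, 0), xor_index(addr, 1))
```

In this toy setup, `a` and `b` share a conventional set index but are separated by the XOR index, while `a` and `c` collide in bank 0 and are separated in bank 1. That dispersal is the property skewed-style mappings rely on; how TAC chooses among banks is described in the thesis itself, not here.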
For the second scheme, which targets branch mispredictions, this thesis presents a new hybrid branch predictor called GoStay2 that can effectively reduce misprediction rates for indirect branches. GoStay2 differs in two ways from other 2-stage hybrid predictors that use a Branch Target Buffer (BTB) as the first-stage predictor. First, to reduce conflict misses in the first stage, an effective 2-way cache scheme is used instead of a 4-way set-associative scheme. Second, to reduce mispredictions caused by an inefficient predict and update rule, a new selection mechanism and update rule are proposed. A simulation program, GoS-Sim, has been developed using Shade and Spixtools, provided by Sun Microsystems, on an Ultra SPARC/10 processor. Our results show significant improvement with these mechanisms compared to other hybrid predictors. For example, GoStay2 improves indirect misprediction rates for BTBs of 64 to 4K entries (with a 512- or 1K-entry PHT) by 14.9% to 21.53% compared to the Cascaded predictor (with leaky filter).
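To make the idea of a 2-stage hybrid predictor for indirect branches more concrete, here is a minimal sketch of the general structure: a BTB as the first stage, backed by a history-indexed second-stage table (standing in for the PHT). Everything in it is an assumption for illustration: the class and method names, the direct-mapped tables, the history hash, and the simple rule that the second stage is allocated only when the first stage was wrong. It is not GoStay2's selection mechanism or update rule, and it is not the Cascaded predictor's leaky filter.

```python
class TwoStagePredictor:
    """Toy 2-stage indirect-branch predictor (illustrative assumptions only)."""

    def __init__(self, btb_entries=64, pht_entries=512, history_bits=9):
        self.btb = [None] * btb_entries   # first stage: (pc, last target)
        self.pht = [None] * pht_entries   # second stage: (pc, target), history-indexed
        self.history = 0                  # hash of recent indirect-branch targets
        self.history_mask = (1 << history_bits) - 1

    def _btb_slot(self, pc):
        return (pc >> 2) % len(self.btb)

    def _pht_slot(self, pc):
        return ((pc >> 2) ^ self.history) % len(self.pht)

    def predict(self, pc):
        """Prefer a second-stage hit; otherwise fall back to the BTB."""
        entry = self.pht[self._pht_slot(pc)]
        if entry is not None and entry[0] == pc:
            return entry[1]
        entry = self.btb[self._btb_slot(pc)]
        if entry is not None and entry[0] == pc:
            return entry[1]
        return None

    def update(self, pc, target):
        """Train both stages with the resolved target, then advance the history."""
        btb_slot = self._btb_slot(pc)
        first_stage_correct = self.btb[btb_slot] == (pc, target)
        pht_slot = self._pht_slot(pc)
        pht_entry = self.pht[pht_slot]
        pht_hit = pht_entry is not None and pht_entry[0] == pc
        # Assumed rule: only branches the BTB gets wrong enter the second stage.
        if pht_hit or not first_stage_correct:
            self.pht[pht_slot] = (pc, target)
        self.btb[btb_slot] = (pc, target)
        self.history = ((self.history << 2) ^ (target >> 2)) & self.history_mask


def count_correct(predictor, trace):
    """Replay (pc, target) pairs and count correct predictions."""
    correct = 0
    for pc, target in trace:
        if predictor.predict(pc) == target:
            correct += 1
        predictor.update(pc, target)
    return correct


# One indirect branch whose target alternates between two addresses.
trace = [(0x1000, 0x2004 if i % 2 == 0 else 0x3008) for i in range(40)]
print(count_correct(TwoStagePredictor(), trace), "of", len(trace))
```

A BTB that only remembers the last target would mispredict every access in this alternating trace, whereas the history-indexed second stage learns the pattern after a short warm-up. This is the kind of behaviour that motivates 2-stage designs in general; the specific gains quoted in the abstract come from the thesis's own mechanisms and simulations.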
Item Metadata
Title |
Cache and branch prediction improvements for advanced computer architecture
|
Creator |
Chu, Yul
|
Publisher |
University of British Columbia
|
Date Issued |
2001
|
Extent |
7190701 bytes
|
Genre | |
Type | |
File Format |
application/pdf
|
Language |
eng
|
Date Available |
2009-10-07
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
|
DOI |
10.14288/1.0065352
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2001-05
|
Campus | |
Scholarly Level |
Graduate
|
Aggregated Source Repository |
DSpace
|