Open Collections
UBC Theses and Dissertations
Error resilience evaluation on GPGPU applications Fang, Bo
Abstract
While graphics processing units (GPUs) have gained wide adoption as accelerators for general-purpose applications (GPGPU), the end-to-end reliability implications of their use have not been quantified. Fault injection is a widely used method for evaluating the reliability of applications. However, building a fault injector for GPGPU applications is challenging due to their massive parallelism, which makes it difficult to achieve representativeness while being time-efficient. This thesis makes three key contributions. First, it presents the design of a fault-injection methodology to evaluate the end-to-end reliability properties of application kernels running on GPUs. Second, it introduces a fault-injection tool that uses real GPU hardware and offers a good balance between the representativeness and the efficiency of the fault-injection experiments. Third, it characterizes the error resilience of twelve GPGPU applications. Finally, it offers preliminary insights into correlations between algorithm properties and the measured silent data corruption rates of applications.
Item Metadata
Title |
Error resilience evaluation on GPGPU applications
|
Creator |
Fang, Bo
|
Publisher |
University of British Columbia
|
Date Issued |
2014
|
Language |
eng
|
Date Available |
2014-08-11
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivs 2.5 Canada
|
DOI |
10.14288/1.0165934
|
Degree Grantor |
University of British Columbia
|
Graduation Date |
2014-09
|
Scholarly Level |
Graduate
|
Aggregated Source Repository |
DSpace
|