- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Towards understanding and mitigating memory-access...
Open Collections
UBC Theses and Dissertations
UBC Theses and Dissertations
Towards understanding and mitigating memory-access challenges in computing systems Dashti, Mohammad K.
Abstract
In an era of hardware diversity, the management of applications' allocated memory is a complex task that can have significant performance repercussions. Non-uniformity in the memory hierarchy, along with heterogeneity and asymmetry of chip designs, make the costs of memory accesses unpredictable if the allocated memory is not managed carefully. Poor memory allocation and placement on symmetric, non-uniform memory access (NUMA) server systems can cause interconnect link congestion and memory controller contention, which can drastically impact performance. Furthermore, asymmetric systems consisting of CPU and GPU cores suffer similarly depending on how memory is allocated and what types of cores are accessing it. Furthermore, modern chip designs are integrating compute units into the memory modules to bring compute closer to memory and eliminate the high costs of transferring data. Hence, accessing memory efficiently is becoming increasingly challenging as systems evolve toward heterogeneity. Our contribution is a detailed analysis and insight into software and hardware memory management techniques on modern symmetric server-class and asymmetric heterogeneous processors consisting of integrated CPU and GPU cores, and to recently commercially available Processing In Memory (PIM) systems. In particular, we examine three types of systems that are affected by memory access bottlenecks. First, we look at NUMA systems, then integrated CPU-GPU systems, and finally PIM systems. We analyze these systems and suggest possible solutions to mitigate memory access issues. For NUMA systems, we propose a holistic memory management algorithm that intelligently distributes memory pages to reduce congestion and improve performance. Then, we provide a detailed analysis of memory management methods on integrated CPU-GPU systems focusing on performance and functionality trade-offs. Our goal is to expose the performance impact of memory management techniques and assess the viability and advantage of running applications with complex behaviors on such integrated CPU-GPU systems. Finally, we examine PIM systems with a case study of image decoding, which is a critical stage for many deep learning applications. We show that the extreme compute scalability of PIM systems can be utilized to accelerate image decoding with performance potential that can surpass CPUs and GPUs.
Item Metadata
Title |
Towards understanding and mitigating memory-access challenges in computing systems
|
Creator | |
Supervisor | |
Publisher |
University of British Columbia
|
Date Issued |
2022
|
Description |
In an era of hardware diversity, the management of applications' allocated memory is a complex task that can have significant performance repercussions. Non-uniformity in the memory hierarchy, along with heterogeneity and asymmetry of chip designs, make the costs of memory accesses unpredictable if the allocated memory is not managed carefully.
Poor memory allocation and placement on symmetric, non-uniform memory access (NUMA) server systems can cause interconnect link congestion and memory controller contention, which can drastically impact performance. Furthermore, asymmetric systems consisting of CPU and GPU cores suffer similarly depending on how memory is allocated and what types of cores are accessing it. Furthermore, modern chip designs are integrating compute units into the memory modules to bring compute closer to memory and eliminate the high costs of transferring data. Hence, accessing memory efficiently is becoming increasingly challenging as systems evolve toward heterogeneity.
Our contribution is a detailed analysis and insight into software and hardware memory management techniques on modern symmetric server-class and asymmetric heterogeneous processors consisting of integrated CPU and GPU cores, and to recently commercially available Processing In Memory (PIM) systems. In particular, we examine three types of systems that are affected by memory access bottlenecks. First, we look at NUMA systems, then integrated CPU-GPU systems, and finally PIM systems. We analyze these systems and suggest possible solutions to mitigate memory access issues.
For NUMA systems, we propose a holistic memory management algorithm that intelligently distributes memory pages to reduce congestion and improve performance. Then, we provide a detailed analysis of memory management methods on integrated CPU-GPU systems focusing on performance and functionality trade-offs. Our goal is to expose the performance impact of memory management techniques and assess the viability and advantage of running applications with complex behaviors on such integrated CPU-GPU systems. Finally, we examine PIM systems with a case study of image decoding, which is a critical stage for many deep learning applications. We show that the extreme compute scalability of PIM systems can be utilized to accelerate image decoding with performance potential that can surpass CPUs and GPUs.
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2022-08-18
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0417420
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2022-11
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International