Effective debug ecosystem for machine learning hardware accelerators

UBC Theses and Dissertations

Effective debug ecosystem for machine learning hardware accelerators Holanda Noronha, Daniel

Abstract

Recent years have seen a dramatic increase in the use of hardware accelerators to perform machine learning computations. Designing these circuits is challenging, especially because bugs may only manifest after long run times and because interactions between hardware and software are complex to understand. As a result, debugging a machine learning accelerator and ensuring that the system delivers acceptable performance are very time-consuming processes that significantly limit productivity. This dissertation investigates how additional circuitry added to machine learning hardware designs can enable effective debugging of those systems, and gathers insights on how to better debug them. More specifically, we focus on techniques suitable for accelerators implemented using Field-Programmable Gate Arrays (FPGAs), since many accelerators are prototyped using this type of reconfigurable fabric. This dissertation makes four major contributions towards this goal. First, we present a debugging framework that allows the designer to observe domain-specific information (e.g. sparsity and other statistics) about the machine learning workload running on the accelerator, rather than raw information that is expensive to trace. This includes the creation of custom circuitry that allows information to be recorded for at least 21.8x longer than in previous debugging architectures. Second, we create a technique to reduce the time between debug iterations: an overlay that enables designers to change which signals are traced and select how those signals are aggregated without resynthesizing the design. Third, we investigate how to debug underperforming training jobs, resulting in a novel programmable debug architecture that allows designers to create custom ways of aggregating data at debug time, instead of being constrained by a few options selected at compile time. Finally, we investigate the impacts of hardware acceleration on the optimization landscape of training systems and how this information may be used to accelerate debug. We anticipate that the concepts presented in this thesis will be used to allow designers to aggregate domain-specific information in future commercial debugging tools and to motivate future work on improving the training performance of hardware accelerators.
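
As a rough illustration of the first contribution, the sketch below contrasts storing raw signal values in a fixed-size trace buffer with storing one aggregated sparsity figure per window of cycles. It is a software analogy only, not the dissertation's circuitry: the function names, the buffer model (one word per stored value), and the window size are all assumptions made for this example, and the resulting ratio is not meant to reproduce the 21.8x figure reported above.

```python
from typing import List


def raw_trace_cycles(buffer_words: int, signals_per_cycle: int) -> int:
    """Cycles of history a buffer holds when every signal value is stored raw.

    Hypothetical model: each traced value occupies one buffer word.
    """
    return buffer_words // signals_per_cycle


def aggregated_trace_cycles(buffer_words: int, window: int) -> int:
    """Cycles covered when one summary word is stored per `window` cycles."""
    return buffer_words * window


def sparsity_per_window(values: List[List[int]], window: int) -> List[float]:
    """Fraction of zero values in each group of `window` consecutive cycles.

    `values[c]` holds the signal values observed in cycle `c`.
    """
    result = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        total = sum(len(row) for row in chunk)
        zeros = sum(1 for row in chunk for v in row if v == 0)
        result.append(zeros / total)
    return result


if __name__ == "__main__":
    # Four cycles of four traced values each (made-up data).
    trace = [[0, 1, 0, 2], [0, 0, 0, 3], [1, 1, 1, 1], [0, 0, 0, 0]]
    print(sparsity_per_window(trace, window=2))
    print(raw_trace_cycles(1024, signals_per_cycle=4))
    print(aggregated_trace_cycles(1024, window=2))
```

In this toy model the aggregated trace covers more cycles with the same buffer because each stored word summarizes several cycles of several signals, at the cost of losing the individual values.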

Item Metadata

Title	Effective debug ecosystem for machine learning hardware accelerators
Creator	Holanda Noronha, Daniel
Supervisor	Wilton, Steve
Publisher	University of British Columbia
Date Issued	2022
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2022-03-11
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0407136
URI	http://hdl.handle.net/2429/80946
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2022-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
