UBC Theses and Dissertations
Towards efficient and intelligent tinyML : acceleration, architectures, and monitoring
Pratap Ghanathe, Nikhil
Abstract
Deploying machine learning (ML) on milliwatt-scale edge devices (tinyML) is gaining popularity due to recent breakthroughs in ML and the Internet of Things (IoT), coupled with hardware and tooling innovations. TinyML applications are always-on and require all computation to run locally on kilobyte-sized tiny-edge devices, opening new avenues for real-time inference and autonomous decision-making at the extreme edge. However, obtaining energy-efficient tinyML solutions poses several challenges. The primary challenge is that tinyML devices are severely constrained in memory, compute, and energy consumption (battery life), which significantly limits the accuracy and complexity of the models that can be deployed on them. Traditional ML models often require substantial computational resources, making them unsuitable for these constrained environments. Additionally, tinyML devices are often deployed in remote, non-stationary environments where access is costly. As a result, post-deployment processes such as monitoring and on-device learning are crucial for maintaining system performance and validity over time; however, contemporary solutions incur large overheads, rendering them impractical for tinyML. TinyML addresses these challenges through a combination of hardware (e.g., specialized low-power chips) and software (e.g., model optimization, approximate computing) innovations. Despite these advances, the ecosystem around tinyML remains immature, as power, memory, and compute constraints continue to impede its full potential. Developing robust solutions that balance efficiency and performance is essential for advancing tinyML technology.
To that end, our research makes progress on three fronts: 1) Exploring hardware-friendly solutions: we develop a tool that compiles a high-level ML specification of classical-ML algorithms to high-performance Verilog code, enabling acceleration on low-power field-programmable gate arrays (FPGAs) and optimizing for energy efficiency without sacrificing performance. 2) Algorithmic and architectural innovations: we develop a novel early-exit architecture optimized for tinyML models that reduces average execution time by 32% with minimal compromise on accuracy. 3) Improving model reliability: to improve model visibility, we investigate advanced on-device monitoring and diagnostic techniques tailored to the strict constraints of tinyML. This thesis aims to promote a mature ecosystem for tinyML development and deployment by making contributions across these three key areas.
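To illustrate the general early-exit idea mentioned in the second contribution: intermediate classifier heads are attached after early stages of a model, and inference stops as soon as a head is confident enough, saving the cost of the remaining stages. The sketch below is a minimal illustration of that generic technique, not the thesis's actual architecture; the function names, stage/head structure, and the 0.9 confidence threshold are all hypothetical choices for demonstration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_infer(x, stages, exit_heads, threshold=0.9):
    """Run `stages` in order; after each stage, the matching exit head
    produces class logits. If the top softmax probability reaches
    `threshold`, stop early. Returns (predicted_class, exit_index).
    The final stage always exits, confident or not."""
    h = x
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)                      # run the next chunk of the model
        probs = softmax(head(h))          # cheap intermediate classifier
        conf = max(probs)
        if conf >= threshold or i == len(stages) - 1:
            return probs.index(conf), i   # exit here, skipping later stages

# Toy model: each "stage" doubles the features, each "head" reads them as logits.
stages = [lambda h: [2 * v for v in h]] * 3
heads = [lambda h: list(h)] * 3
pred, taken = early_exit_infer([1.0, 0.0], stages, heads)  # exits at the second head
```

On easy inputs the confidence threshold is met early and the later (most expensive) stages never run, which is the source of the average-latency savings the abstract reports.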
Item Metadata
Title | Towards efficient and intelligent tinyML : acceleration, architectures, and monitoring
Creator | Pratap Ghanathe, Nikhil
Publisher | University of British Columbia
Date Issued | 2025
Language | eng
Date Available | 2025-03-18
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0448215
Degree Grantor | University of British Columbia
Graduation Date | 2025-05
Scholarly Level | Graduate
Aggregated Source Repository | DSpace