UBC Theses and Dissertations

Supernormals: a floating-point format with fine-grained range control and tapered precision for DNN training

Pun, Shing Wai

Abstract

The scaling of modern deep neural network (DNN) models has driven the adoption of smaller, lower-precision floating-point formats to meet computational, memory, and energy demands. However, low-precision formats impose severe range and precision restrictions compared to traditional standards such as IEEE 754 binary32 (FP32), which can degrade DNN training results or cause outright divergence. This thesis addresses the problem with Supernormals, a novel tapered-precision encoding that enables parameterizable range extension by trading precision for range in specific regions of the encoding space. By systematically varying the independent range and precision requirements of DNN training at large, middle, and small magnitudes, the thesis ultimately demonstrates a 6-bit Supernormal format that nearly matches 8-bit floating-point formats in DNN training performance.

The first part of the thesis explores the range-extension capability and potential logic-area savings of using Supernormals to replace traditional floating-point subnormals and special encodings. The approach replaces both the smallest and largest exponent encodings of a floating-point format with a configurable 0-bit-mantissa (M0) regime. This trades precision for extended range, reducing subnormal logic overhead while capturing the magnitudes of the infrequent tail values seen during DNN training (e.g., small gradients) to improve training stability.

The second part of the thesis explores the numerical requirements of DNN training under tensor-level scaling. It systematically quantifies the independent impacts of range and precision across numerical magnitudes on DNN training stability, and then evaluates a 6-bit Supernormal floating-point format against existing 6- and 8-bit formats.
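To make the M0 regime concrete, below is a minimal, hypothetical Python decoder for a toy 6-bit format (1 sign, 3 exponent, 2 mantissa bits) in the spirit of the abstract's description: the smallest and largest exponent encodings are repurposed as 0-bit-mantissa regimes whose mantissa bits act as extra power-of-two exponent steps. The bit widths, bias, and regime boundaries here are illustrative assumptions, not the thesis's exact parameterization.

# Illustrative sketch of a Supernormal-style decode; the parameterization
# (which codes are repurposed, and how far the range extends) is assumed.
def decode_supernormal6(code: int) -> float:
    """Decode a 6-bit word laid out as [s | eee | mm]."""
    s = (code >> 5) & 0x1   # sign bit
    e = (code >> 2) & 0x7   # 3-bit exponent field (0..7)
    m = code & 0x3          # 2-bit mantissa field (0..3)
    sign = -1.0 if s else 1.0
    bias = 3                # 2**(3-1) - 1, IEEE-style biasing

    if 1 <= e <= 6:
        # Ordinary normal encodings: 1.mm * 2**(e - bias)
        return sign * (1.0 + m / 4.0) * 2.0 ** (e - bias)
    if e == 0:
        # Small M0 regime replaces subnormals: the mantissa bits
        # become extra exponent steps below the normal minimum.
        if m == 0:
            return sign * 0.0                  # keep one code for zero
        return sign * 2.0 ** (1 - bias - m)    # 2**-3, 2**-4, 2**-5
    # e == 7: large M0 regime replaces the Inf/NaN encodings,
    # extending the top of the range with power-of-two steps.
    return sign * 2.0 ** (6 - bias + 1 + m)    # 2**4 .. 2**7

Decoding all 64 codes shows the taper: two fraction bits of precision across the normal band (0.25 to 14), but pure power-of-two steps at the extremes, stretching the representable range down to 2**-5 and up to 128 at zero mantissa precision.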

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International