Fault injection in Machine Learning applications

UBC Theses and Dissertations

Narayanan, Niranjhana

Abstract

As Machine Learning (ML) has seen increasing adoption in safety-critical domains (e.g., autonomous vehicles), the reliability of ML systems has also grown in importance. While prior studies have proposed techniques to enable efficient error resilience (e.g., selective instruction duplication), a fundamental requirement for realizing these techniques is a detailed understanding of the application's resilience. The primary part of this thesis focuses on studying ML application resilience to hardware and software faults. To this end, we present the TensorFI tool set, consisting of TensorFI 1 and TensorFI 2, which are high-level fault injection frameworks for TensorFlow 1 and TensorFlow 2 respectively. With this tool set, we inject faults into TensorFlow programs and study important reliability aspects such as model resilience to different kinds of faults, operator-level and layer-level resilience of different models, and the effect of hyperparameter variations. We evaluate the resilience of 12 ML applications, including those used in the autonomous vehicle domain. From our experiments, we find that resilience differs significantly across ML applications and across configurations. Further, we find that applications are more vulnerable to bit-flip faults than to other kinds of faults. We conduct four case studies to demonstrate some use cases of the tool set. We identify the image classes that are most and least resilient to faults in a traffic sign recognition model. We consider layer-wise resilience and observe that faults in the initial layers of an application result in higher vulnerability. In addition, we visualize the outputs from layer-wise injection in an image segmentation model, and are able to identify the layer in which faults occurred based on the faulty prediction masks. These case studies thus provide valuable insights into how to improve the resilience of ML applications.

The secondary part of this thesis focuses on studying ML application resilience to data faults (e.g., adversarial inputs, labeling errors, common corruptions/noisy data). We present a data mutation tool, TensorFlow Data Mutator (TF-DM), which targets different kinds of data faults commonly occurring in ML applications. We conduct experiments using TF-DM and outline the resiliency analysis of different models and datasets.
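
For readers unfamiliar with the bit-flip fault model mentioned in the abstract, the following is a minimal sketch of what a single bit flip does to a 32-bit floating-point value. It is not the TensorFI API: the function name `flip_bit` and its parameters are hypothetical, and the sketch uses plain NumPy rather than TensorFlow. It assumes values are stored as IEEE 754 float32.

```python
import numpy as np

def flip_bit(values, index, bit):
    """Return a float32 copy of `values` with one bit flipped in one element.

    Hypothetical helper for illustration only; not part of TensorFI.
    `index` selects the element (in flattened order) and `bit` selects
    the bit position, where 0 is the least significant mantissa bit and
    31 is the sign bit.
    """
    out = np.array(values, dtype=np.float32)      # np.array copies by default
    bits = out.reshape(-1).view(np.uint32)        # reinterpret the same memory
    bits[index] ^= np.uint32(1) << np.uint32(bit) # toggle the chosen bit
    return out

activations = [1.0, 2.0, 3.0]
print(flip_bit(activations, 0, 31))  # sign bit: 1.0 becomes -1.0
print(flip_bit(activations, 0, 23))  # lowest exponent bit: 1.0 becomes 0.5
print(flip_bit(activations, 0, 30))  # highest exponent bit: 1.0 becomes inf
```

The size of the resulting error depends heavily on which bit is flipped: a flip in a low mantissa bit changes the value only slightly, while a flip in a high exponent bit can turn an ordinary value into infinity.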

Item Metadata

Title	Fault injection in Machine Learning applications
Creator	Narayanan, Niranjhana
Publisher	University of British Columbia
Date Issued	2021
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2021-04-23
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0396949
URI	http://hdl.handle.net/2429/77963
Degree (Theses)	Master of Applied Science - MASc
Program (Theses)	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2021-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
