UBC Theses and Dissertations

Generalization bounds and size generalization for graph neural networks
Sales, Emmanuel

Abstract

Graph neural networks (GNNs) are a class of machine learning models that relax the independent and identically distributed (i.i.d.) assumption between data points that underlies most machine learning models. Theoretical understanding of these models involves analyzing generalization bounds, a framework for provably bounding the discrepancy between expected training and test loss. We advance state-of-the-art PAC-Bayes generalization bounds for GNNs using insights from graph theory and random matrix theory, and perform experiments for validation. One of the most important directions in modern theoretical machine learning is the analysis of out-of-distribution error, that is, error measured on examples drawn from a distribution distinct from the training distribution. In the graph learning setting in particular, there are important questions to explore about size generalization: the capacity of a graph neural network to make predictions on graphs much larger than those seen in its training set. We develop a theoretical framework for size generalization by analyzing graph learning settings in which GNNs can easily perform size generalization, and we prove probabilistic theorems analyzing measures of generalization error, building on the PAC-Bayes analysis.
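For context, PAC-Bayes generalization bounds of the kind refined in this thesis typically take a form like the standard McAllester-style statement sketched below. This is the generic bound, not the thesis's GNN-specific result; the symbols (prior P and posterior Q over hypotheses, sample size m, confidence parameter δ) are the conventional ones, assumed here purely for illustration.

    % One common form of the PAC-Bayes bound (McAllester-style), shown as
    % background only; the thesis develops GNN-specific refinements of such bounds.
    % With probability at least 1 - \delta over the draw of m i.i.d. training
    % samples, simultaneously for every posterior distribution Q over hypotheses:
    \[
      \mathbb{E}_{h \sim Q}\big[L(h)\big]
      \;\le\;
      \mathbb{E}_{h \sim Q}\big[\hat{L}(h)\big]
      + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m}{\delta}}{2(m-1)}}
    \]
    % L(h): expected (test) loss of hypothesis h;
    % \hat{L}(h): empirical (training) loss of h;
    % P: prior over hypotheses fixed before seeing the training data.

The GNN setting departs from this template because the data points (nodes or graphs) are not i.i.d., which is precisely the assumption the thesis's analysis relaxes.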


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International