UBC Theses and Dissertations

Generalization bounds and size generalization for graph neural networks
Sales, Emmanuel

Abstract

Graph neural networks (GNNs) are a class of machine learning models that relax the independent and identically distributed (i.i.d.) assumption between data points that underlies most machine learning models. Theoretical understanding of these models involves analyzing generalization bounds, a framework for provably bounding the discrepancy between expected training and test loss. We advance state-of-the-art PAC-Bayes generalization bounds for GNNs using insights from graph theory and random matrix theory, and perform experiments for validation. One of the most important directions in modern theoretical machine learning is the analysis of out-of-distribution error, that is, error measured on examples drawn from a distribution distinct from the training distribution. In the graph learning setting in particular, there are important questions to explore about size generalization: the capacity of a graph neural network to make predictions on graphs much larger than those seen in its training set. We develop a theoretical framework for size generalization by analyzing graph learning settings in which GNNs can easily perform size generalization, and we prove probabilistic theorems analyzing measures of generalization error, building on the PAC-Bayes analysis.
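For context, PAC-Bayes generalization bounds of the kind refined in this thesis typically take a form like the standard McAllester-style statement sketched below. This is the generic bound, not the thesis's GNN-specific result; the symbols (prior P and posterior Q over hypotheses, sample size m, confidence parameter δ) are the conventional ones, assumed here purely for illustration.

    % One common form of the PAC-Bayes bound (McAllester-style), shown as
    % background only; the thesis develops GNN-specific refinements of such bounds.
    % With probability at least 1 - \delta over the draw of m i.i.d. training
    % samples, simultaneously for every posterior distribution Q over hypotheses:
    \[
      \mathbb{E}_{h \sim Q}\big[L(h)\big]
      \;\le\;
      \mathbb{E}_{h \sim Q}\big[\hat{L}(h)\big]
      + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m}{\delta}}{2(m-1)}}
    \]
    % L(h): expected (test) loss of hypothesis h;
    % \hat{L}(h): empirical (training) loss of h;
    % P: prior over hypotheses fixed before seeing the training data.

The GNN setting departs from this template because the data points (nodes or graphs) are not i.i.d., which is precisely the assumption the thesis's analysis relaxes.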


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International