Open Collections
UBC Theses and Dissertations
The effectiveness of GNNs for node classification : the significance of side information
Liu, Xiaoou
Abstract
This thesis studies the effectiveness of Graph Neural Networks (GNNs) for node classification. We conduct systematic experiments on several representative deep-learning models for graph data, using training data generated from the Stochastic Block Model (SBM) and the theoretical results on the fundamental limits of this model as guidance in the design of our experiments. While GNNs are widely believed to be powerful learning models for graph data, our empirical findings suggest that they do not necessarily outperform other machine-learning-based methods and traditional algorithms for node classification. In particular, we observe that GNN-based methods fail to exploit the information from labeled nodes in semi-supervised learning settings. We propose an effective data augmentation method to enhance GNN-based methods by making better use of labeled information in the training data. Our experiments using synthetic data from SBMs and real-world datasets demonstrate that our method can significantly enhance the capabilities of GNN models and notably improve their performance for node classification. Additionally, in the context of unsupervised learning, we discuss the possibility of incorporating other types of side information into GNNs that may exist in multiplex network data.
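The abstract names the Stochastic Block Model as the source of synthetic training graphs but does not give the parameters used. As a minimal sketch only, assuming a symmetric SBM in which each node is assigned to one of `k` communities uniformly at random and edges appear with probability `p_in` within a community and `p_out` across communities, a graph could be sampled as follows. The function name `sample_sbm` and all parameter values are hypothetical and are not taken from the thesis.

```python
import random


def sample_sbm(n, k, p_in, p_out, seed=0):
    """Sample an undirected graph from a symmetric k-community SBM.

    Hypothetical helper for illustration; not the thesis's code.
    Returns (labels, edges): labels[i] is the community of node i,
    edges is a list of (i, j) pairs with i < j.
    """
    rng = random.Random(seed)
    # Assign each node to a community uniformly at random.
    labels = [rng.randrange(k) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Same-community pairs connect with p_in, others with p_out.
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not from the thesis).
    labels, edges = sample_sbm(n=200, k=2, p_in=0.10, p_out=0.01)
    within = sum(labels[i] == labels[j] for i, j in edges)
    print(f"{len(edges)} edges, {within} within communities")
```

In a semi-supervised setting of the kind the abstract describes, a subset of `labels` would be revealed to the classifier and the rest held out for evaluation.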
Item Metadata
Title | The effectiveness of GNNs for node classification : the significance of side information
Creator | Liu, Xiaoou
Publisher | University of British Columbia
Date Issued | 2024
Language | eng
Date Available | 2024-06-17
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0443981
Degree Grantor | University of British Columbia
Graduation Date | 2024-09
Scholarly Level | Graduate
Aggregated Source Repository | DSpace