Open Collections

UBC Theses and Dissertations

Structured representation learning by controlling generative models

He, Xingzhe

Abstract

Object correspondence and structure play critical roles in image generation, 3D reconstruction, and animation. In recent years, supervised algorithms have dramatically improved the accuracy of learned correspondence. However, these approaches are expensive because they require manual annotation, and they do not generalize well to new domains. In this dissertation, we propose several methods for unsupervised structure learning from casually recorded images and videos. Specifically, we propose a Generative Adversarial Network (GAN)-based unsupervised keypoint detector and extend it to object part segmentation. Furthermore, we introduce a representation for unsupervised estimation of keypoint relationships. We later adapt this technique to few-shot keypoint learning, depth prediction, and occlusion handling. In addition, we propose a dataset generation approach for diffusion model personalization to implicitly learn object structure and appearance. The overarching goal of this dissertation is to advance unsupervised object correspondence and structure learning. Our proposed methods outperform existing unsupervised methods on established keypoint estimation and part segmentation benchmarks and pave the way for structure-conditioned generative models on more diverse datasets.

Item Metadata

Title	Structured representation learning by controlling generative models
Creator	He, Xingzhe
Supervisor	Rhodin, Helge
Publisher	University of British Columbia
Date Issued	2024
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2024-03-06
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0440629
URI	http://hdl.handle.net/2429/87531
Degree	Doctor of Philosophy - PhD
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2024-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
