UBC Theses and Dissertations
Weakly-supervised geometry-aware novel view synthesis
Perryman, Olivia
Abstract
Enabling computers to understand and interpret visual information is crucial for the development of more sophisticated and interactive technologies. Learning structure, such as 3D shape, can help computers interpret visual information more effectively. Our model disentangles object structure and appearance in a weakly supervised manner from multiview images within a single category. We extract a 3D pointcloud from images and reconstruct consistent novel views by rendering the pointcloud from different perspectives. Using a much simpler model and far fewer training examples than costly state-of-the-art diffusion models, we can recover 3D structure from single images of objects and quickly reconstruct them from unseen viewpoints. Our findings suggest that understanding the 3D constraints of the real world can enhance performance on visual tasks and make models more robust and generalizable to a wider variety of inputs. This approach enables downstream tasks such as pose transfer and spatially-guided conditional image generation, and paves the way for commonsense reasoning. Our work has potential applications in augmented reality and visual effects; further exploration of the model's capabilities and its integration into broader systems could enhance visual understanding.
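The core operation behind "rendering the pointcloud from different perspectives" is standard pinhole-camera projection: transform the points into the target camera's frame, divide by depth, and apply the intrinsics. The sketch below is a minimal illustration of that step under those assumptions, not code from the thesis; the function name, intrinsics, and toy cube are all hypothetical.

```python
import numpy as np

def project_pointcloud(points, K, R, t):
    """Project an (N, 3) world-space pointcloud into a target view.

    points : (N, 3) array of 3D points (e.g. a recovered pointcloud)
    K      : (3, 3) camera intrinsics
    R, t   : (3, 3) rotation and (3,) translation of the target camera
    Returns (N, 2) pixel coordinates and (N,) depths.
    """
    cam = points @ R.T + t                 # world -> camera coordinates
    depths = cam[:, 2]
    proj = (cam / depths[:, None]) @ K.T   # perspective divide, then intrinsics
    return proj[:, :2], depths

# Example: a toy cube viewed by a camera offset 4 units along +z.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                dtype=float)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 4.0])
pixels, depths = project_pointcloud(cube, K, R, t)
print(pixels.round(1))
```

Varying R and t while holding the pointcloud fixed is what yields consistent novel views of the same object: the geometry is shared across renders, and only the camera changes.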
Item Metadata

| Field | Value |
| --- | --- |
| Title | Weakly-supervised geometry-aware novel view synthesis |
| Creator | Perryman, Olivia |
| Supervisor | |
| Publisher | University of British Columbia |
| Date Issued | 2024 |
| Genre | |
| Type | |
| Language | eng |
| Date Available | 2024-08-30 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
| DOI | 10.14288/1.0445271 |
| URI | |
| Degree | |
| Program | |
| Affiliation | |
| Degree Grantor | University of British Columbia |
| Graduation Date | 2024-11 |
| Campus | |
| Scholarly Level | Graduate |
| Rights URI | |
| Aggregated Source Repository | DSpace |