UBC Theses and Dissertations
Weakly-supervised geometry-aware novel view synthesis
Perryman, Olivia
Abstract
Enabling computers to understand and interpret visual information is crucial for the development of more sophisticated and interactive technologies. Learning structure, such as 3D shape, can help computers interpret visual information more effectively. Our model disentangles object structure and appearance in a weakly supervised manner from multiview images within a single category. We extract a 3D pointcloud from images and reconstruct consistent novel views by rendering the pointcloud from different perspectives. Using a much simpler model and far fewer training examples than costly state-of-the-art diffusion models, we can recover 3D structure from single images of objects and quickly reconstruct them from unseen viewpoints. Our findings suggest that understanding the 3D constraints of the real world can enhance performance on visual tasks and make models more robust and generalizable to a wider variety of inputs. This approach enables downstream tasks such as pose transfer and spatially-guided conditional image generation, and paves the way for commonsense reasoning. Our work has potential applications in augmented reality and visual effects; further exploration of the model's capabilities and its integration into broader systems could enhance visual understanding.
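The core operation behind "rendering the pointcloud from different perspectives" is standard pinhole-camera projection: transform the points into the target camera's frame, divide by depth, and apply the intrinsics. The sketch below is a minimal illustration of that step under those assumptions, not code from the thesis; the function name, intrinsics, and toy cube are all hypothetical.

```python
import numpy as np

def project_pointcloud(points, K, R, t):
    """Project an (N, 3) world-space pointcloud into a target view.

    points : (N, 3) array of 3D points (e.g. a recovered pointcloud)
    K      : (3, 3) camera intrinsics
    R, t   : (3, 3) rotation and (3,) translation of the target camera
    Returns (N, 2) pixel coordinates and (N,) depths.
    """
    cam = points @ R.T + t                 # world -> camera coordinates
    depths = cam[:, 2]
    proj = (cam / depths[:, None]) @ K.T   # perspective divide, then intrinsics
    return proj[:, :2], depths

# Example: a toy cube viewed by a camera offset 4 units along +z.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                dtype=float)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 4.0])
pixels, depths = project_pointcloud(cube, K, R, t)
print(pixels.round(1))
```

Varying R and t while holding the pointcloud fixed is what yields consistent novel views of the same object: the geometry is shared across renders, and only the camera changes.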
Item Metadata

| Field | Value |
| --- | --- |
| Title | Weakly-supervised geometry-aware novel view synthesis |
| Creator | Perryman, Olivia |
| Supervisor | |
| Publisher | University of British Columbia |
| Date Issued | 2024 |
| Genre | |
| Type | |
| Language | eng |
| Date Available | 2024-08-30 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
| DOI | 10.14288/1.0445271 |
| URI | |
| Degree | |
| Program | |
| Affiliation | |
| Degree Grantor | University of British Columbia |
| Graduation Date | 2024-11 |
| Campus | |
| Scholarly Level | Graduate |
| Rights URI | |
| Aggregated Source Repository | DSpace |