Flexible conditioning in generative models of images and video
Harvey, William
Abstract
Recent advances in the field of deep generative modelling are leading to increasingly faithful models of real-world data, including images and videos. Of particular practical interest are conditional generative models, which parameterise conditional probability distributions given data features. Flexibly-conditional generative models go beyond conventional conditional models in that they allow any subset of data features to be conditioned on. This makes them applicable to tasks like image inpainting, where we want the same model that can inpaint, say, the top half of an image to also be capable of inpainting the bottom half. Flexible conditioning has previously been demonstrated for data types including fixed-size images and short videos, but our thesis is that it can be enabled in a much broader variety of settings. The first setting we consider is long-video generation, which is normally problematic because the data is high-dimensional and compute constraints can prevent our model from conditioning on all possible frames. The second is where the data dimensionality (e.g. the number of frames in a video) is stochastic and can depend on what we condition on. We present techniques to enable flexible conditioning in both of these settings. We further show that the resulting models can sometimes improve on baselines in terms of sample quality, even for conventional generation tasks. Another barrier to flexibly-conditional modelling has been the computational cost of training any high-quality generative model on moderate- or high-resolution visual data. We therefore end by presenting the first technique to mitigate this cost for the training of flexibly-conditional variational auto-encoders, by incorporating pretrained unconditional model weights.
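To make the notion of flexible conditioning concrete, the sketch below (Python, not drawn from the thesis) illustrates the kind of interface such a model exposes: a single sampler that accepts an arbitrary observation mask, so the same weights can inpaint the top half of an image, the bottom half, or any other pattern of observed pixels. The `model.sample(...)` call and its signature are hypothetical placeholders, not the thesis's actual API.

```python
# Minimal sketch of flexible conditioning (illustrative only, not from the thesis).
# A flexibly-conditional model exposes one sampler that takes an arbitrary
# observation mask, so the same weights handle any conditioning pattern.
import numpy as np

def inpaint(model, x_obs, obs_mask):
    """Sample the unobserved entries of x_obs given the observed ones.

    `model.sample` is a hypothetical interface: it is assumed to return a full
    array that agrees with `x_obs` wherever `obs_mask` is True and fills in the
    remaining entries from the learned conditional distribution.
    """
    x = model.sample(x_obs, obs_mask)
    # Observed entries stay fixed; only the unobserved entries are generated.
    assert np.allclose(x[obs_mask], x_obs[obs_mask])
    return x

# The *same* model can condition on the top half of an image to generate the
# bottom half, or vice versa, just by changing the mask:
# image = ...                                   # array of shape (H, W, C)
# top = np.zeros(image.shape, dtype=bool)
# top[: image.shape[0] // 2] = True
# bottom_half = inpaint(model, image, top)      # condition on top, fill bottom
# top_half = inpaint(model, image, ~top)        # condition on bottom, fill top
```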
Item Metadata
Title | Flexible conditioning in generative models of images and video
Creator | Harvey, William
Publisher | University of British Columbia
Date Issued | 2024
Description | (same as the Abstract above)
Language | eng
Date Available | 2024-08-22
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0445139
Degree Grantor | University of British Columbia
Graduation Date | 2024-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace