UBC Theses and Dissertations

Flexible conditioning in generative models of images and video

Harvey, William

Abstract

Recent advances in deep generative modelling are leading to increasingly faithful models of real-world data, including images and videos. Of particular practical interest are conditional generative models, which parameterise conditional probability distributions given data features. Flexibly-conditional generative models go further than conventional conditional models in that they allow any data features to be conditioned on. This makes them applicable to tasks like image inpainting, where we want the same model that can inpaint, e.g., the top half of an image to also be capable of inpainting the bottom half. Flexible conditioning has previously been demonstrated for data types including fixed-size images and short videos, but our thesis is that it can be enabled in a much broader variety of settings. The first setting we consider is long-video generation, which is normally problematic because the data is high-dimensional and compute constraints can prevent the model from conditioning on all possible frames. The second is where the data dimensionality (e.g. the number of frames in a video) is stochastic and can depend on what we condition on. We present techniques to enable flexible conditioning in both of these settings. We further show that the resulting models can sometimes improve on baselines in terms of sample quality, even for conventional generation tasks. Another barrier to flexibly-conditional modelling has been the computational cost of training high-quality generative models on moderate- or high-resolution visual data. We therefore end by presenting the first technique to mitigate this cost when training flexibly-conditional variational auto-encoders, by incorporating pretrained unconditional model weights.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International