UBC Theses and Dissertations

An efficient middle-out prediction structure for light field video compression using MV-HEVC

Khoury, Joseph

Abstract

Light field imaging has emerged as a technology that enables the capture of richer visual information. While traditional photography captures only a 2D projection of the light in a scene, a light field camera collects the radiance of rays in all directions and extracts the angular information that is otherwise lost in conventional photography. This angular information can be used to substantially improve immersiveness, focus, depth, color, intensity and perspective, opening up new market opportunities. Nevertheless, the high dimensionality of light fields brings its own challenges, such as the size of the captured data. Research in light field image compression is becoming increasingly popular, but light field video compression remains relatively under-explored. State-of-the-art solutions attempt to apply existing multi-view coding (MVC) methods to encode light field videos. While these solutions show potential, they do not address the bandwidth problem imposed by the size of the data involved. Hence, there is a real need for improvement that takes advantage of the additional redundancies of light field video and the intricacies of this data. In this thesis, we propose a three-dimensional prediction structure for efficiently coding light field video using the MV-HEVC standard. First, we modify the inter-view structure to exploit the higher similarity found among the central set of views compared to those around the edges. In addition, the selection of which views start with a P-frame takes into account maximizing their utilization as references by other views. Second, we build upon this structure by expanding the GOP size and creating a more efficient temporal structure that better utilizes the higher-fidelity sequences as references for subsequent frames. The scheme contains several temporal structures for compressing the views, each chosen according to the view's position in the encoding order. This allows later views to rely more heavily on frames from other views as references, compensating for the fact that their preceding frames are more coarsely quantized by comparison.
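
The abstract does not spell out the exact view ordering, so the following is only an illustrative sketch of what a middle-out encoding order could look like. It assumes the light field views form a rectangular grid and ranks them by Chebyshev distance from the grid centre; the function name `middle_out_order`, the grid layout, and the distance metric are all assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple


def middle_out_order(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Return (row, col) view indices ordered from the grid centre outward.

    Hypothetical helper: views closer to the centre are listed first, so
    they would be encoded earlier and be available as inter-view references.
    """
    centre_r = (rows - 1) / 2
    centre_c = (cols - 1) / 2

    def ring(view: Tuple[int, int]) -> float:
        # Chebyshev distance to the centre: all views on the same square
        # "ring" around the centre share the same value.
        r, c = view
        return max(abs(r - centre_r), abs(c - centre_c))

    views = [(r, c) for r in range(rows) for c in range(cols)]
    # Sort by ring first; break ties by raster position for determinism.
    return sorted(views, key=lambda v: (ring(v), v))


if __name__ == "__main__":
    for position, view in enumerate(middle_out_order(3, 3)):
        print(position, view)
```

In this sketch the central view is coded first and the outermost ring last, which mirrors the idea that edge views can lean on already-coded central views as references. An actual MV-HEVC configuration would also have to specify frame types and reference lists per view, which this example does not attempt.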

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International