UBC Theses and Dissertations

Advances in image and video quality enhancement

Ma, Zhenchao

Abstract

The rapid growth of imaging technologies and the proliferation of visual data have made image processing and quality enhancement critical across multiple fields, including multimedia systems and artificial intelligence. Advances in image and video capture and display, together with improved internet access and new broadcasting and streaming technologies, have significantly raised image and video quality. However, this improvement comes at the cost of higher bandwidth and storage requirements. At the same time, technological advances and visual quality improvements tend to raise consumer expectations. Compression codecs reduce the size of images and videos, enabling efficient storage and transmission, but they inevitably introduce compression artifacts such as blockiness, blurring, and flickering. These artifacts not only degrade perceived visual quality but also reduce the performance of machine learning tasks. This thesis addresses these challenges by developing methods in three key areas: image and video compression artifact reduction, image quality enhancement for machine consumption, and image super-resolution. First, we propose a Dynamic Window Swin Transformer for image compression artifact reduction, introducing a content-adaptive dynamic window mechanism to capture dependencies. This approach significantly reduces compression artifacts and improves visual quality in images degraded by modern compression standards. For video quality enhancement, we adapt the Recurrent Video Restoration Transformer (RVRT) with guided deformable attention to reduce HEVC compression artifacts, achieving significant improvements. For image enhancement aimed at machine consumption, we propose a joint restoration-classification network that combines image enhancement with classification, using a linear combination loss to optimize both restoration quality and classification accuracy, particularly for compressed images.
Finally, we tackle stereo image super-resolution with StereoMamba+, a novel framework that leverages the Mamba architecture to adaptively capture local and global dependencies in stereo pairs. StereoMamba+ integrates an Adaptive State Space Module (ASSM), a Gated Enhanced Feed-Forward Network (GEFN), and a Stereo Bi-Directional Cross-Attention Module (SBCAM) to enhance resolution and stereo consistency. In summary, this thesis advances image and video processing by developing methods that improve visual quality for both human perception and machine consumption. Our contributions aim to bridge the gap between theoretical understanding and practical applications in image and video quality enhancement.
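
The linear combination loss mentioned above for the joint restoration-classification network can be sketched as follows. This is an illustrative form only: the weights $\lambda_r$ and $\lambda_c$, the network symbols $f_\theta$ (restoration) and $g_\phi$ (classification), and the specific restoration and classification terms are assumptions, not details stated in the abstract.

```latex
% Hypothetical notation (not from the abstract):
%   \tilde{x} : compressed input image     x : uncompressed reference
%   y         : class label
%   f_\theta  : restoration network        g_\phi : classifier
\mathcal{L}_{\text{joint}}
  = \lambda_r \, \mathcal{L}_{\text{rest}}\bigl(f_\theta(\tilde{x}),\, x\bigr)
  + \lambda_c \, \mathcal{L}_{\text{cls}}\bigl(g_\phi(f_\theta(\tilde{x})),\, y\bigr),
\qquad \lambda_r,\ \lambda_c \ge 0
```

Under this assumed form, the ratio of $\lambda_c$ to $\lambda_r$ sets the trade-off between pixel-level fidelity to the reference and the classifier's accuracy on the restored image.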

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International