UBC Theses and Dissertations

Image and video classification and image similarity measurement by learning sparse representation

Guha, Tanaya


Sparse representation of signals has recently emerged as a major research area. It is well known that many natural signals can be sparsely represented using a properly chosen dictionary (e.g. one formed of wavelet bases). A dictionary is complete or overcomplete depending on whether the number of bases it contains equals or exceeds the dimensionality of the given signal. Traditionally, predefined dictionaries have been prevalent in sparse analysis. However, a more general approach is to learn the dictionary from the signal itself. Learnt dictionaries are known to outperform predefined dictionaries in several applications. This thesis explores the use of sparse representations of signals, obtained by learning overcomplete dictionaries, in three applications: 1) classification of images and videos, 2) measurement of similarity between two images, and 3) assessment of the perceptual quality of an image. This thesis first capitalizes on the natural discriminative ability of sparse representations to develop efficient classification algorithms. The proposed algorithms are employed in image-based face recognition and video-based human action recognition. They are shown to perform better than state-of-the-art methods. The thesis then studies how to obtain a good measure of similarity between two images. Despite the long history of image similarity evaluation, open issues still exist. These include the need to develop generic similarity measures that do not assume any prior knowledge of the task at hand or the data type. This thesis develops a generic image similarity measure based on learning sparse representations. Successful application of the proposed measure to clustering, retrieval and classification of different types of images is demonstrated. The thesis then examines a highly promising approach to assessing the perceptual quality of an image. This approach involves comparing the structural information of a possibly distorted image with that of its reference image. Extracting the structural information that is important to the human visual system is a challenging task. A sparse representation-based image quality assessment approach is proposed to address this issue. When compared with seven existing metrics, our method performs best on three databases and ranks among the top three on the remaining three databases.
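
The idea of a sparse representation over an overcomplete dictionary can be illustrated with a small sketch. The example below is not the method of the thesis: it uses a hand-picked (not learnt) dictionary of three unit-norm atoms in two dimensions and plain matching pursuit, a standard greedy sparse coding scheme, chosen here only for brevity. The function name `matching_pursuit` and all values are assumptions made for illustration.

```python
import math


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse coding sketch; assumes every atom has unit Euclidean norm.

    Returns one coefficient per atom and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate the current residual with every atom.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        # Select the atom with the largest absolute correlation.
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        coeffs[best] += scores[best]
        # Remove the selected atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


s = 1 / math.sqrt(2)
# Three atoms for a 2-dimensional signal: more atoms than dimensions,
# so this illustrative dictionary is overcomplete.
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
signal = [2 * s, 2 * s]

coeffs, residual = matching_pursuit(signal, atoms, n_iter=1)
nonzero = [k for k, c in enumerate(coeffs) if abs(c) > 1e-9]
```

With the complete dictionary formed by the first two atoms alone, this signal needs two nonzero coefficients. Adding the third atom lets a single coefficient represent it, which is the sense in which a well-chosen overcomplete dictionary yields a sparser representation.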


Attribution-NonCommercial-NoDerivs 2.5 Canada