UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

A comparative analysis of multi-dimensional indexing structures for "eigenimages" Sedighian, Andishe Samandari

Abstract

Content-based retrieval in image management systems requires indexing of image feature vectors. Most feature vectors have a high number of dimensions (>20). This makes indexing difficult since most existing multi-dimensional , indexing structures grow exponentially in size as dimensions increase. We approach this problem in three stages: i) reduce the dimensionality of the feature space, ii) evaluate existing multi-dimensional indexing structures to determine which can best organize the new feature space, and iii) customize one of the selected structures to improve search performance. To reduce the dimensionality of the feature space without losing much information we apply a statistical technique called Principal Component Analysis (PCA), using Turk and Pentland's eigenfaces approach. We then conduct a comparative analysis of a wide range of existing multi-dimensional indexing structures (quad trees, KD-trees, R-trees, gridfile, and multipaging), selecting three of them (bucket adaptive KD-tree, gridfile, R-tree) for further empirical comparisons. Tests show that the bucket adaptive KD-tree uses the least storage and peforms the best during search. Finally, we customize the bucket adaptive KD-tree by implementing techniques that take advantage of the characteristics of the transformed space — namely ranked dimensions by decreasing variance and known dynamic ranges. This prunes the search space and results in very efficient searches. The number of page accesses are reduced significantly, some times leading to savings as high as 70%.

Item Media

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.