UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Statistical inference on structured data : sequences and graphs Wang, Ziao

Abstract

In the era of big data, structured data, such as sequences and graphs, have become central to various scientific and engineering disciplines. This dissertation explores statistical inference techniques tailored to these complex data structures, focusing on the development of novel methodologies that address the challenges posed by sequences and graphs. The first part of the dissertation addresses the inference of sequences, with a particular focus on the noisy computing problem. We investigate several fundamental computing tasks under an active learning setting, including finding the total ordering or the largest element of a real sequence through noisy pairwise comparisons and determining whether the weight of a binary sequence exceeds a certain threshold based on noisy bit readings. To characterize optimal query complexity, we propose novel query strategies and develop statistical lower bounds, which tighten the existing bounds for all these problems. The second part of the dissertation studies the inference on graphs, specifically focusing on the graph alignment problem. Graph alignment refers to the task of finding the vertex correspondence between two correlated graphs. We explore a variant of this problem known as attributed graph alignment, where additional attribute information is leveraged to aid the alignment process. We propose new polynomial-time algorithms for this problem and provide a theoretical analysis of their performance. Our results demonstrate that these algorithms can successfully identify the correct vertex correspondence under a weak correlation between the two graphs.

Item Media

Item Citations and Data

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International