UBC Theses and Dissertations
On LASSO parameter sensitivity Berk, Aaron
Compressed sensing (CS) is a paradigm in which a structured high-dimensional signal may be recovered from random, under-determined, corrupted linear measurements. LASSO programs are effective for solving CS problems due to their proven ability to leverage underlying signal structure. Three popular LASSO programs are equivalent in a sense and sometimes used interchangeably. Tuned by a governing parameter, each admits an optimal parameter choice yielding minimax order-optimal error. CS is well-studied, though theory for LASSO programs typically concerns this optimally tuned setting. However, the optimal parameter value for a LASSO program depends on properties of the data and is typically unknown in practical settings. Performance in empirical problems thus hinges on a program's parameter sensitivity: it is desirable that small variation about the optimal parameter choice begets small variation about the optimal risk. We examine the risk of three LASSO programs as a function of their governing parameters and further demonstrate that their parameter sensitivity can differ for the same data, thereby informing the selection of a LASSO program in practice. We prove a gauge-constrained program admits asymptotic cusp-like behaviour of its risk in the limiting low-noise regime; a residual-constrained program has asymptotically suboptimal risk for very sparse vectors (i.e., for any fixed parameter choice, the ratio of the risk to the optimally achievable risk grows unbounded). These results contrast observations about an unconstrained program with sufficiently large parameter. Our theory is supported with extensive numerical simulations, demonstrating the parameter sensitivity phenomenon for even modest dimensional parameters. We first analyze these risks for proximal denoising (PD), in which one directly observes signal plus noise (i.e., the measurement matrix is identity). There, we further reveal a data regime in which the unconstrained PD risk can be asymptotically suboptimal. We also show how our theory extends to analyze generalized LASSO programs for generalized CS. Finally, we extend a keystone of our theoretical analysis, the projection lemma. We generalize the result to an arbitrary Hilbert space and extend it from scaled projections to proximal mappings of a dilated gauge. We discuss applications and possible directions for these results.
Item Citations and Data
Attribution-NonCommercial-NoDerivatives 4.0 International