BIRS Workshop Lecture Videos
Distributional Robustness and Regularization in Statistical Learning
Kleywegt, Anton
Description
A central problem in statistical learning is to design prediction algorithms that not only perform well on training data but also generalize to new, unseen, yet similar data. We approach this problem by formulating a distributionally robust stochastic optimization (DRSO) problem, which seeks a solution that minimizes the worst-case expected loss over a family of distributions close to the empirical distribution in Wasserstein distance. We establish a connection between such Wasserstein DRSO problems and regularization. Specifically, we identify a broad class of loss functions for which the Wasserstein DRSO is asymptotically equivalent to a regularization problem with a gradient-norm penalty. This relation provides a new interpretation for approaches that use regularization, including a variety of statistical learning problems and discrete choice models. The connection also suggests a principled way to regularize high-dimensional, non-convex problems, which is demonstrated by training Wasserstein generative adversarial networks (WGANs) in deep learning. This is joint work with Rui Gao and Xi Chen.
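For readers skimming the abstract, a schematic statement of the equivalence it describes may help. The notation below is ours, not necessarily the speaker's: $\ell_\theta$ is the loss, $\hat{P}_n$ the empirical distribution of $n$ samples, $W_p$ the order-$p$ Wasserstein distance, $\rho$ the radius of the ambiguity set, and $q$ the conjugate exponent of $p$.

```latex
% Wasserstein DRSO: worst-case expected loss over a Wasserstein ball of
% radius \rho around the empirical distribution. For suitable losses and
% small \rho, it is asymptotically equivalent to empirical risk plus a
% gradient-norm penalty, with \lVert\cdot\rVert_* the dual norm on the data space.
\[
\min_{\theta}\ \sup_{Q:\,W_p(Q,\hat P_n)\le\rho}\ \mathbb{E}_{\xi\sim Q}\bigl[\ell_\theta(\xi)\bigr]
\;\approx\;
\min_{\theta}\ \mathbb{E}_{\xi\sim\hat P_n}\bigl[\ell_\theta(\xi)\bigr]
  + \rho\,\Bigl(\mathbb{E}_{\xi\sim\hat P_n}\bigl[\lVert\nabla_\xi\ell_\theta(\xi)\rVert_*^{\,q}\bigr]\Bigr)^{1/q},
\qquad \tfrac{1}{p}+\tfrac{1}{q}=1 .
\]
```

And a minimal PyTorch-style sketch of how such a penalty could enter a training loop. This is a hypothetical illustration, not the authors' implementation; `model`, `loss_fn`, and `rho` are placeholder names, and the WGAN specifics of the talk are omitted.

```python
# A minimal sketch of gradient-norm regularization in the spirit of the
# Wasserstein-DRSO equivalence above. Hypothetical throughout: `model`,
# `loss_fn`, and `rho` are placeholders, not code from the talk.
import torch

def drso_surrogate_loss(model, loss_fn, x, y, rho=0.1):
    """Empirical risk plus rho times the mean data-gradient norm.

    A first-order surrogate for the worst-case expected loss over a
    2-Wasserstein ball of radius rho around the empirical distribution.
    """
    x = x.clone().requires_grad_(True)
    per_sample = loss_fn(model(x), y)  # expects reduction='none': shape (batch,)
    # Gradient of the loss with respect to the *data* (not the parameters);
    # create_graph=True keeps the penalty differentiable for the optimizer.
    (grads,) = torch.autograd.grad(per_sample.sum(), x, create_graph=True)
    # Per-sample Euclidean norm corresponds to a 2-Wasserstein ball.
    penalty = grads.flatten(start_dim=1).norm(dim=1).mean()
    return per_sample.mean() + rho * penalty
```

Note that the gradient is taken with respect to the data rather than the parameters, which is what distinguishes this penalty from ordinary weight decay.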
Item Metadata
Title: Distributional Robustness and Regularization in Statistical Learning
Creator: Kleywegt, Anton
Publisher: Banff International Research Station for Mathematical Innovation and Discovery
Date Issued: 2018-03-06T09:51
Extent: 48 minutes
File Format: video/mp4
Language: eng
Notes: Author affiliation: Georgia Institute of Technology
Date Available: 2018-09-03
Provider: Vancouver : University of British Columbia Library
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
DOI: 10.14288/1.0371887
Affiliation: Georgia Institute of Technology
Peer Review Status: Unreviewed
Scholarly Level: Faculty
Aggregated Source Repository: DSpace