BIRS Workshop Lecture Videos

Banff International Research Station Logo

BIRS Workshop Lecture Videos

Rate of convergence of the Nesterov accelerated gradient method in the subcritical case $\alpha \leq 3$ Attouch, Hedy

Description

(<em>Based on a joint work with Z. Chbani and H. Riahi</em>)
In a Hilbert space setting $\mathcal{H}$, given $\Phi: \mathcal H \to \mathbb R$ a convex continuously differentiable function, and $\alpha$ a positive parameter, we first study the asymptotic behavior of the inertial system with Asymptotic Vanishing Damping $$ \mbox{(AVD)}_{\alpha} \quad \quad \ddot{x}(t) + \frac{\alpha}{t} \dot{x}(t) + \nabla \Phi (x(t)) =0, $$ and then the associated inertial algorithms.
Depending on the value of $ \alpha $ with respect to 3, and based on new Lyapunov functions, we give a complete picture of the convergence properties as $t \to + \infty$ of the trajectories generated by $\mbox{(AVD)}_{\alpha}$. As shown by Su-Boyd-Cand&egrave;s, the case $\alpha = 3$ corresponds to a continuous version of the accelerated gradient method of Nesterov, with the convergence rate $\Phi (x(t))-\min \Phi = \mathcal O (t^{-2})$. Our main result concerns the subcritical case $\alpha \leq 3$, where we show that $\Phi (x(t))-\min \Phi = \mathcal O (t^{-\frac{2}{3}\alpha})$. When $\alpha > 3$, we find the recent results by May and A-Chbani-Peypouquet-Redont concerning the weak convergence of the trajectories, and the convergence of the values with the order $o\left(t^{-2} \right)$. This overall picture shows a continuous variation of the rate of convergence of the values $\Phi(x(t))-\min_\mathcal H \Phi= \mathcal O (t^{-p(\alpha)}) $ with respect to $\alpha >0$: the coefficient $p(\alpha)$ increases linearly up to 2 when $\alpha$ goes from $0$ to $3$, then displays a plateau. We also consider the convergence of trajectories in the critical case $ \alpha = $ 3, with a positive response in some particular cases.
Then we consider structured convex minimization problems of the form $\min \left\lbrace \Theta:= \Phi + \Psi \right\rbrace$, with $\Phi$ smooth and $\Psi$ nonsmooth. As a major property, the Lyapunov analysis of the continuous dynamics serves as a guideline for the study of the associated forward-backward inertial algorithms. We obtain a similar convergence rate for the sequence of iterates $(x_k)$: for $\alpha < 3$ we have $\Theta (x_k)-\min \Theta = \mathcal O (k^{-p})$ for all $p <\frac{2\alpha}{3}$, for $\alpha = 3$ \ $\Theta (x_k)-\min \Theta = \mathcal O (k^{-2})$ (FISTA, Beck-Teboulle, 2009), and for $\alpha > 3$ \ $\Theta (x_k)-\min \Theta = o (k^{-2})$ (A-Peypouquet, 2016). We conclude this study by showing that the results are robust with respect to external perturbations.

<dl> <dt>[1]</dt> <dd>H. Attouch, Z. Chbani, J. Peypouquet, P. Redont, <i>Fast convergence of inertial dynamics and algorithms with asymptotic vanishing damping</i>, to appear in Math. Program. DOI: 10.1007/s10107-016-0992-8.</dd> <dt>[2]</dt> <dd>H. Attouch, J. Peypouquet, <i>The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than $\frac{1}{k^2}$</i>, SIAM J. Optim., 26 (2016), No. 3, pp. 1824-1834.</dd> <dt>[3]</dt> <dd> A. Beck, M. Teboulle, <i>A fast iterative shrinkage-thresholding algorithm for linear inverse problems</i>, SIAM J. Imaging Sci., 2 (2009), No. 1, pp. 183-202.</dd> <dt>[4]</dt> <dd> A. Chambolle, C. Dossal, <i>On the convergence of the iterates of the “Fast Iterative Shrinkage/Thresholding Algorithm”</i>, Journal of Optimization Theory and Applications, 166 (2015), pp. 968-982</dd> <dt>[5]</dt> <dd> Y. Nesterov, <i>Introductory lectures on convex optimization: A basic course</i>, volume 87 of Applied Optimization. Kluwer Academic Publishers, Boston, MA, 2004</dd> </dl>

Item Media

Item Citations and Data

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International