Introduction

In this chapter, we study the convergence properties of unconstrained optimization algorithms, that is, algorithms for problems of the form

$$
\begin{array}{ll}
\text{minimize} & \quad f(x)\\
\text{subject to} & \quad x \in \mathbb{R}^n
\end{array}
$$

(Note that we will use $f$ to denote the objective function instead of $f_0$ to simplify the notation in the chapter.) We also make two assumptions:

Recall from Chapter 4 the principle of descent methods: starting from an initial point $x^{(0)}$, iteratively compute

$$
x^{(k+1)} = x^{(k)} + \alpha_k d_k
$$

where $\alpha_k$ is the stepsize and $d_k$ is a descent direction, i.e., a direction such that $d_k^\top \nabla f(x^{(k)}) < 0$.
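As a concrete illustration, the sketch below implements this iteration in Python for the particular choice $d_k = -\nabla f(x^{(k)})$ (steepest descent) with a fixed stepsize $\alpha_k = \alpha$; the function names, the stopping rule, and the test problem are our own illustrative choices, not part of the text.

```python
import numpy as np

def descent_method(grad_f, x0, alpha=0.1, max_iter=1000, tol=1e-8):
    """Descent iteration x^{(k+1)} = x^{(k)} + alpha_k d_k with d_k = -grad f(x^{(k)})."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:  # stop once the gradient is numerically zero
            break
        d = -g                       # d_k^T grad f(x^{(k)}) = -||grad f(x^{(k)})||^2 < 0
        x = x + alpha * d
    return x

# Illustrative problem: f(x) = 1/2 ||x||^2, minimized at x = 0.
grad_f = lambda x: x
print(descent_method(grad_f, x0=[2.0, -1.0]))  # approaches [0, 0]
```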

Two important examples

In this chapter, we will study convergence properties of such algorithms. We will see how properties of the objective function $f$ impact what can be said about the sequence $\lbrace x^{(k)}\rbrace$, in terms of type and rate of convergence.
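As a rough preview of what "rate of convergence" means in practice, the sketch below tracks the error $\|x^{(k)} - x^\star\|$ of gradient descent on a simple strongly convex quadratic; the quadratic, the stepsize, and the number of iterations are illustrative assumptions of ours. The error ratios settling to a constant strictly below 1 is the hallmark of linear (geometric) convergence.

```python
import numpy as np

# Gradient descent on f(x) = 1/2 x^T A x with A = diag(1, 10); the minimizer is x* = 0.
A = np.diag([1.0, 10.0])
grad_f = lambda x: A @ x
x_star = np.zeros(2)

x = np.array([2.0, -1.0])
alpha = 1.0 / 10.0                  # stepsize 1/L, with L the largest eigenvalue of A
errors = []
for _ in range(30):
    errors.append(np.linalg.norm(x - x_star))
    x = x - alpha * grad_f(x)

ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
print(ratios[-5:])                  # ratios approach ~0.9, i.e., linear convergence
```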
