Problems (1) and (3) have a specific structure that can be exploited to devise efficient algorithms.
Using the Jacobian matrix allows for a simple expression of the gradient and Hessian of $f_0(x) = \frac{1}{2}\|R(x)\|_2^2$.
Let us compute the gradient and Hessian of $f_0$:
$$
\nabla f_0(x) = J(x)^\top R(x), \qquad
\nabla^2 f_0(x) = J(x)^\top J(x) + \sum_{i=1}^{m} R_i(x)\,\nabla^2 R_i(x).
$$
The gradient has a simple expression in terms of the Jacobian, and near a solution where the residuals $R_i(x)$ are small, the second term of the Hessian is negligible, meaning the Hessian can be well-approximated without second derivatives.
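As a quick sanity check of the gradient formula, here is a minimal sketch assuming a linear residual $R(x) = Ax - b$ with synthetic data (all names below are illustrative, not from the text), comparing $J(x)^\top R(x)$ against central finite differences:

```python
import numpy as np

# Hypothetical residual R(x) = A x - b, so f_0(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((5, 3)), rng.standard_normal(5)
R = lambda x: A @ x - b
J = lambda x: A                      # Jacobian of R is constant here
f0 = lambda x: 0.5 * np.sum(R(x) ** 2)

x = rng.standard_normal(3)
grad = J(x).T @ R(x)                 # gradient via the Jacobian formula
eps = 1e-6
grad_fd = np.array([(f0(x + eps * e) - f0(x - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
assert np.allclose(grad, grad_fd, atol=1e-5)
```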
We exploit these two properties to construct dedicated nonlinear least squares algorithms: the Gauss-Newton method and the Levenberg-Marquardt method.
The Gauss-Newton method can be seen as a quasi-Newton method with matrix $B_k = J(x^{(k)})^\top J(x^{(k)})$ and constant step size $\alpha_k = 1$.
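To make this interpretation concrete, here is a minimal Gauss-Newton sketch (the names residual and jacobian stand for a user-supplied model; they are not defined in the text): each iteration solves the normal equations $J^\top J\,\Delta x = -J^\top R$ and takes a full step.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, n_iter=50, tol=1e-10):
    """Gauss-Newton: solve J^T J dx = -J^T R, then take x <- x + dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        R = residual(x)
        J = jacobian(x)
        # Quasi-Newton step with B_k = J^T J and step size alpha_k = 1
        dx = np.linalg.solve(J.T @ J, -J.T @ R)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x
```

Note that $J^\top J$ can be singular when $J$ is rank-deficient; robust implementations solve the linear least squares subproblem $\min_{\Delta x} \|J\,\Delta x + R\|_2$ (e.g. via a QR factorization) instead of forming the normal equations.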
The Levenberg-Marquardt method can also be interpreted as a quasi-Newton method with constant step size $\alpha_k = 1$, this time with matrix $B_k = J(x^{(k)})^\top J(x^{(k)}) + \lambda_k I$. The regularization parameter $\lambda_k$ is tuned throughout the iterations.
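A minimal sketch of this tuning strategy, assuming the common heuristic of shrinking $\lambda_k$ after a successful step and growing it after a rejected one (the factor 10 and the function names are illustrative choices, not prescribed by the method):

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, lam=1e-3, n_iter=100, tol=1e-10):
    """LM: solve (J^T J + lam * I) dx = -J^T R, adapting lam between iterations."""
    x = np.asarray(x0, dtype=float)
    cost = 0.5 * np.sum(residual(x) ** 2)
    for _ in range(n_iter):
        R, J = residual(x), jacobian(x)
        dx = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ R)
        x_new = x + dx
        cost_new = 0.5 * np.sum(residual(x_new) ** 2)
        if cost_new < cost:      # step accepted: trust the model more
            x, cost, lam = x_new, cost_new, lam / 10
        else:                    # step rejected: regularize more strongly
            lam *= 10
        if np.linalg.norm(dx) < tol:
            break
    return x
```

When $\lambda_k$ is large the update approaches a small gradient step, and when $\lambda_k$ is small it approaches the Gauss-Newton step, so the method interpolates between the two.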
The Levenberg-Marquardt method is one of the standard algorithms for nonlinear least squares problems. It is implemented in many libraries, such as scipy.optimize.least_squares (use the option method='lm').
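For instance, fitting a hypothetical exponential model $y \approx a\,e^{bt}$ to synthetic data (the model and data here are illustrative):

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic data generated by y = 2.0 * exp(-1.5 * t)
t = np.linspace(0, 1, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(x):
    a, b = x
    return a * np.exp(b * t) - y

result = least_squares(residual, x0=[1.0, 0.0], method='lm')
print(result.x)  # should converge near [2.0, -1.5]
```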