Newton’s Method¶
Algorithm Description¶
Newton’s method is used to solve convex optimization problems of the form
\[\min_{x \in \mathbb{R}^n} f(x)\]
where \(f\) is convex and twice differentiable. The algorithm requires both the gradient and the Hessian of \(f\) to be provided by the user.
The updating rule for Newton’s method is described below. Let \(x^{(i)}\) denote the candidate solution vector at stage \(i\) of the algorithm.
Compute the descent direction using:
\[d^{(i)} = - [H(x^{(i)})]^{-1} [\nabla_x f(x^{(i)})]\]
Compute the optimal step size using line search:
\[\alpha^{(i)} = \arg \min_{\alpha} f(x^{(i)} + \alpha d^{(i)})\]
Update the candidate solution vector using:
\[x^{(i+1)} = x^{(i)} + \alpha^{(i)} d^{(i)}\]
The algorithm stops when at least one of the following conditions is met:

- the norm of the gradient vector, \(\| \nabla f \|\), is less than grad_err_tol;
- the relative change between \(x^{(i+1)}\) and \(x^{(i)}\) is less than rel_sol_change_tol;
- the total number of iterations exceeds iter_max.
Function Declarations¶
-
bool newton(ColVec_t &init_out_vals, std::function<fp_t(const ColVec_t &vals_inp, ColVec_t *grad_out, Mat_t *hess_out, void *opt_data)> opt_objfn, void *opt_data)¶
Newton’s Nonlinear Optimization Algorithm.
- Parameters
init_out_vals – a column vector of initial values, which will be replaced by the solution upon successful completion of the optimization algorithm.
opt_objfn – the function to be minimized, taking four arguments:

- vals_inp – a vector of inputs;
- grad_out – a vector to store the gradient;
- hess_out – a matrix to store the Hessian; and
- opt_data – additional data passed to the user-provided function.
opt_data – additional data passed to the user-provided function.
- Returns
a boolean value indicating successful completion of the optimization algorithm.
-
bool newton(ColVec_t &init_out_vals, std::function<fp_t(const ColVec_t &vals_inp, ColVec_t *grad_out, Mat_t *hess_out, void *opt_data)> opt_objfn, void *opt_data, algo_settings_t &settings)¶
Newton’s Nonlinear Optimization Algorithm.
- Parameters
init_out_vals – a column vector of initial values, which will be replaced by the solution upon successful completion of the optimization algorithm.
opt_objfn – the function to be minimized, taking four arguments:

- vals_inp – a vector of inputs;
- grad_out – a vector to store the gradient;
- hess_out – a matrix to store the Hessian; and
- opt_data – additional data passed to the user-provided function.
opt_data – additional data passed to the user-provided function.
settings – parameters controlling the optimization routine.
- Returns
a boolean value indicating successful completion of the optimization algorithm.
Optimization Control Parameters¶
The basic control parameters are:
- fp_t grad_err_tol: the error tolerance value controlling how small the L2 norm of the gradient vector, \(\| \nabla f \|\), should be before ‘convergence’ is declared.
- fp_t rel_sol_change_tol: the error tolerance value controlling how small the proportional change in the solution vector should be before ‘convergence’ is declared. The relative change is computed using:
\[\left\| \dfrac{x^{(i)} - x^{(i-1)}}{ |x^{(i-1)}| + \epsilon } \right\|_1\]
- size_t iter_max: the maximum number of iterations/updates before the algorithm exits.
In addition to these:
- int print_level: the level of detail for printing updates on optimization progress.
  - Level 0: nothing (default).
  - Level 1: print the iteration count and current error values.
  - Level 2: Level 1 plus the current candidate solution values, \(x^{(i+1)}\).
  - Level 3: Level 2 plus the direction vector, \(d^{(i)}\), and the gradient vector, \(\nabla_x f(x^{(i+1)})\).
  - Level 4: Level 3 plus the Hessian matrix, \(H(x^{(i)})\).
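These control parameters are set through the algo_settings_t object accepted by the second newton overload. A minimal sketch, assuming the settings fields carry the same names listed in this section:

```cpp
optim::algo_settings_t settings;

settings.grad_err_tol = 1e-10;       // L2 gradient-norm tolerance
settings.rel_sol_change_tol = 1e-12; // relative solution-change tolerance
settings.iter_max = 500;             // iteration cap
settings.print_level = 1;            // iteration count and error values

bool success = optim::newton(x, unconstr_test_fn_1_whess, nullptr, settings);
```

This fragment is not standalone; it assumes x and unconstr_test_fn_1_whess are defined as in the Examples section below.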
Examples¶
Example 1¶
Code to run this example is given below.
Armadillo:
#define OPTIM_ENABLE_ARMA_WRAPPERS
#include "optim.hpp"

#include <iostream>

inline
double
unconstr_test_fn_1_whess(const arma::vec& vals_inp, arma::vec* grad_out, arma::mat* hess_out, void* opt_data)
{
    const double x_1 = vals_inp(0);
    const double x_2 = vals_inp(1);

    double obj_val = 3*x_1*x_1 + 2*x_1*x_2 + x_2*x_2 - 4*x_1 + 5*x_2;

    if (grad_out) {
        (*grad_out)(0) = 6*x_1 + 2*x_2 - 4;
        (*grad_out)(1) = 2*x_1 + 2*x_2 + 5;
    }

    if (hess_out) {
        (*hess_out)(0,0) = 6.0;
        (*hess_out)(0,1) = 2.0;
        (*hess_out)(1,0) = 2.0;
        (*hess_out)(1,1) = 2.0;
    }

    return obj_val;
}

int main()
{
    arma::vec x = arma::zeros(2,1); // initial values: (0,0)

    bool success = optim::newton(x, unconstr_test_fn_1_whess, nullptr);

    if (success) {
        std::cout << "newton: test completed successfully." << "\n";
    } else {
        std::cout << "newton: test completed unsuccessfully." << "\n";
    }

    arma::cout << "newton: solution to test:\n" << x << arma::endl;

    return 0;
}
Eigen:
#define OPTIM_ENABLE_EIGEN_WRAPPERS
#include "optim.hpp"

#include <iostream>

inline
double
unconstr_test_fn_1_whess(const Eigen::VectorXd& vals_inp, Eigen::VectorXd* grad_out, Eigen::MatrixXd* hess_out, void* opt_data)
{
    const double x_1 = vals_inp(0);
    const double x_2 = vals_inp(1);

    double obj_val = 3*x_1*x_1 + 2*x_1*x_2 + x_2*x_2 - 4*x_1 + 5*x_2;

    if (grad_out) {
        (*grad_out)(0) = 6*x_1 + 2*x_2 - 4;
        (*grad_out)(1) = 2*x_1 + 2*x_2 + 5;
    }

    if (hess_out) {
        (*hess_out)(0,0) = 6.0;
        (*hess_out)(0,1) = 2.0;
        (*hess_out)(1,0) = 2.0;
        (*hess_out)(1,1) = 2.0;
    }

    return obj_val;
}

int main()
{
    Eigen::VectorXd x = Eigen::VectorXd::Zero(2); // initial values: (0,0)

    bool success = optim::newton(x, unconstr_test_fn_1_whess, nullptr);

    if (success) {
        std::cout << "newton: test completed successfully." << "\n";
    } else {
        std::cout << "newton: test completed unsuccessfully." << "\n";
    }

    std::cout << "newton: solution to test:\n" << x << std::endl;

    return 0;
}