Nonlinear models¶
The abstraction for nonlinear models is independent of both the solver and the user’s representation of the problem, whether using an algebraic modeling language or customized low-level code.
The diagram below illustrates MathProgBase as the connection between typical nonlinear programming (NLP) solvers IPOPT, MOSEK, and KNITRO, and modeling languages such as JuMP and AMPL (via AmplNLReader.jl).
This structure also makes it easy to connect solvers written in Julia itself with user-provided instances in a variety of formats.
We take the prototypical format for a nonlinear problem as
Where \(x \in \mathbb{R}^n, f: \mathbb{R}^n \to \mathbb{R}, g: \mathbb{R}^n \to \mathbb{R}^m\), and vectors \(lb \in (\mathbb{R} \cup \{-\infty\})^m, ub \in (\mathbb{R} \cup \{\infty\})^m,l \in (\mathbb{R} \cup \{-\infty\})^n, u \in (\mathbb{R} \cup \{\infty\})^n\).
The objective function \(f\) and constraint function \(g\) may be nonlinear and nonconvex, but are typically expected to be twice differentiable.
Below we describe the interface for AbstractNonlinearModel
.
-
loadproblem!
(m::AbstractNonlinearModel, numVar, numConstr, l, u, lb, ub, sense, d::AbstractNLPEvaluator)¶ Loads the nonlinear programming problem into the model. The parameter numVar is the number of variables in the problem,
numConstr
is the number of constraints,l
contains the variable lower bounds,u
contains the variable upper bounds,lb
contains the constraint lower bounds, andub
contains the constraint upper bounds. Sense contains the symbol:Max
or:Min
, indicating the direction of optimization. The final parameterd
is an instance of anAbstractNLPEvaluator
, described below, which may be queried for evaluating \(f\) and \(g\) and their corresponding derivatives.
The abstract type AbstractNLPEvaluator
is used by solvers for accessing the objective function \(f\) and constraints \(g\). Solvers may query the value, gradients, Hessian-vector products, and the Hessian of the Lagrangian.
-
initialize
(d::AbstractNLPEvaluator, requested_features::Vector{Symbol})¶ Must be called before any other methods. The vector
requested_features
lists features requested by the solver. These may include:Grad
for gradients of \(f\),:Jac
for explicit Jacobians of \(g\),:JacVec
for Jacobian-vector products,:HessVec
for Hessian-vector and Hessian-of-Lagrangian-vector products,:Hess
for explicit Hessians and Hessian-of-Lagrangians, and:ExprGraph
for expression graphs.
-
features_available
(d::AbstractNLPEvaluator)¶ Returns the subset of features available for this problem instance, as a list of symbols in the same format as in
initialize
.
-
eval_f
(d::AbstractNLPEvaluator, x)¶ Evaluate \(f(x)\), returning a scalar value.
-
eval_g
(d::AbstractNLPEvaluator, g, x)¶ Evaluate \(g(x)\), storing the result in the vector
g
which must be of the appropriate size.
-
eval_grad_f
(d::AbstractNLPEvaluator, g, x)¶ Evaluate \(\nabla f(x)\) as a dense vector, storing the result in the vector
g
which must be of the appropriate size.
-
jac_structure
(d::AbstractNLPEvaluator)¶ Returns the sparsity structure of the Jacobian matrix \(J_g(x) = \left[ \begin{array}{c} \nabla g_1(x) \\ \nabla g_2(x) \\ \vdots \\ \nabla g_m(x) \end{array}\right]\) where \(g_i\) is the \(i\text{th}\) component of \(g\). The sparsity structure is assumed to be independent of the point \(x\). Returns a tuple
(I,J)
whereI
contains the row indices andJ
contains the column indices of each structurally nonzero element. These indices are not required to be sorted and can contain duplicates, in which case the solver should combine the corresponding elements by adding them together.
-
hesslag_structure
(d::AbstractNLPEvaluator)¶ Returns the sparsity structure of the Hessian-of-the-Lagrangian matrix \(\nabla^2 f + \sum_{i=1}^m \nabla^2 g_i\) as a tuple
(I,J)
whereI
contains the row indices andJ
contains the column indices of each structurally nonzero element. These indices are not required to be sorted and can contain duplicates, in which case the solver should combine the corresponding elements by adding them together. Any mix of lower and upper-triangular indices is valid. Elements(i,j)
and(j,i)
, if both present, should be treated as duplicates.
-
eval_jac_g
(d::AbstractNLPEvaluator, J, x)¶ Evaluates the sparse Jacobian matrix \(J_g(x) = \left[ \begin{array}{c} \nabla g_1(x) \\ \nabla g_2(x) \\ \vdots \\ \nabla g_m(x) \end{array}\right]\). The result is stored in the vector
J
in the same order as the indices returned byjac_structure
.
-
eval_jac_prod
(d::AbstractNLPEvaluator, y, x, w)¶ Computes the Jacobian-vector product \(J_g(x)w\), storing the result in the vector
y
.
-
eval_jac_prod_t
(d::AbstractNLPEvaluator, y, x, w)¶ Computes the Jacobian-transpose-vector product \(J_g(x)^Tw\), storing the result in the vector
y
.
-
eval_hesslag_prod
(d::AbstractNLPEvaluator, h, x, v, σ, μ)¶ Given scalar weight
σ
and vector of constraint weightsμ
, computes the Hessian-of-the-Lagrangian-vector product \(\left(\sigma\nabla^2 f(x) + \sum_{i=1}^m \mu_i \nabla^2 g_i(x)\right)v\), storing the result in the vectorh
.
-
eval_hesslag
(d::AbstractNLPEvaluator, H, x, σ, μ)¶ Given scalar weight
σ
and vector of constraint weightsμ
, computes the sparse Hessian-of-the-Lagrangian matrix \(\sigma\nabla^2 f(x) + \sum_{i=1}^m \mu_i \nabla^2 g_i(x)\), storing the result in the vectorH
in the same order as the indices returned byhesslag_structure
.
-
isobjlinear
(d::AbstractNLPEvaluator)¶ true
if the objective function is known to be linear,false
otherwise.
-
isobjquadratic
(d::AbstractNLPEvaluator)¶ true
if the objective function is known to be quadratic (convex or nonconvex),false
otherwise.
-
isconstrlinear
(d::AbstractNLPEvaluator, i)¶ true
if the \(i\text{th}\) constraint is known to be linear,false
otherwise.
-
obj_expr
(d::AbstractNLPEvaluator)¶ Returns an expression graph for the objective function as a standard Julia
Expr
object. All sums and products are flattened out as simpleExpr(:+,...)
andExpr(:*,...)
objects. The symbolx
is used as a placeholder for the vector of decision variables. No other undefined symbols are permitted; coefficients are embedded as explicit values. For example, the expression \(x_1+\sin(x_2/\exp(x_3))\) would be represented as the Julia object:(x[1] + sin(x[2]/exp(x[3])))
. See the Julia manual for more information on the structure ofExpr
objects. There are currently no restrictions on recognized functions; typically these will be built-in Julia functions like^
,exp
,log
,cos
,tan
,sqrt
, etc., but modeling interfaces may choose to extend these basic functions.
-
constr_expr
(d::AbstractNLPEvaluator, i)¶ Returns an expression graph for the \(i\text{th}\) constraint in the same format as described above. The head of the expression is
:comparison
, indicating the sense of the constraint. The right-hand side of the comparison must be a constant; that is,:(x[1]^3 <= 1)
is allowed, while:(1 <= x[1]^3)
is not valid. Double-sided constraints are allowed, in which case both the lower bound and upper bounds should be constants; for example,:(-1 <= cos(x[1]) + sin(x[2]) <= 1)
is valid.
Nonlinear solvers may also provide optimal Lagrange multipliers if available through getreducedcosts
and getconstrduals
.
-
getreducedcosts
(m::AbstractNonlinearModel)¶ Returns the dual solution vector corresponding to the variable bounds, known as the reduced costs. Not available when integer variables are present.
-
getconstrduals
(m::AbstractNonlinearModel)¶ Returns the dual solution vector corresponding to the constraints. Not available when integer variables are present.