If you see this, something is wrong

LaTex2Web logo

LaTeX2Web, a web authoring and publishing system

LaTex2Web logo

LaTeX2Web, a web authoring and publishing system

Table of contents

Digest for "Sensitivity analysis for parametric nonlinear programming: A tutorial"

Abstract

This tutorial provides an overview of the current state-of-the-art in the sensitivity analysis for nonlinear programming. Building upon the fundamental work of Fiacco, it derives the sensitivity of primal-dual solutions for regular nonlinear programs and explores the extent to which Fiacco’s framework can be extended to degenerate nonlinear programs with non-unique dual solutions. The survey ends with a discussion on how to adapt the sensitivity analysis for conic programs and approximate solutions obtained from interior-point algorithms.

Definition 1 (Critical cone)

For a feasible point \( x \in X(p)\) , the critical cone associated to the feasible set \( X(p)\) at \( x\) is defined as

\[ \begin{equation} \begin{aligned} \mathcal{C}_p(x) = \Big\{ d \in \mathbb{R}^{n} \; : \; & \nabla_{x} f(x, p)^\top d \leq 0 \;, \\ & \nabla_{x} g_i(x, p)^\top d = 0 \;,\forall i \in [m_e] \;, \\ & \nabla_{x} h_j(x, p)^\top d \leq 0 \; , \forall j \in \mathcal{I}_p(x) \Big\} \; . \end{aligned} \end{equation} \]

(13)

Definition 2 (Critical set)

Let \( (x^\star, y^\star, z^\star)\) be a KKT stationary solution. The critical set at the solution is the polyhedral cone defined as

\[ \begin{equation} \begin{aligned} \mathcal{K}_p(x^\star, z^\star) = \{ d \in \mathbb{R}^{n + \ell} \; : \; & \nabla_{(x,p)} g_i(x^\star, p)^\top d = 0 \;, \forall i \in [m_e] \,,\\ & \nabla_{(x,p)} h_j(x^\star, p)^\top d = 0 \; , \forall j \in \mathcal{I}_p^+(x^\star; z^\star) \,, \\ & \nabla_{(x,p)} h_j(x^\star, p)^\top d \leq 0 \; , \forall j \in \mathcal{I}_p^0(x^\star; z^\star) \} \; . \end{aligned} \end{equation} \]

(16)

Elements of \( \mathcal{K}_p(x^\star, z^\star)\) are called critical directions. The directional critical set in the direction \( h \in \mathbb{R}^{\ell}\) is defined as

\[ \begin{equation} \mathcal{K}_p(x^\star, z^\star; h) = \{ d \in \mathbb{R}^{n} \; : \; (d, h) \in \mathcal{K}_p(x^\star, z^\star) \} \;. \end{equation} \]

(17)

Definition 3 (Linear-independence constraint qualification)

LICQ holds if the gradients of active constraints

\[ \begin{equation} \{\nabla_x g_i(x, p) \; : \; i \in [m_e]\}~\cup~ \{\nabla_x h_i(x, p) \; : \; i \in \mathcal{I}_p(x) \} \; , \end{equation} \]

(18)

are linearly independent.

Theorem 1

Suppose \( (x^\star, y^\star, z^\star)\) is a primal-dual solution of (4) satisfying LICQ. Then,

\[ \begin{equation} d^\top \nabla^2_{x x} L(x, y, z, p) d \geq 0 ~ \forall d \in \mathcal{C}_p(x) \; , \end{equation} \]

(19)

where \( \mathcal{C}_p(x)\) is the critical cone of the feasible set \( X(p)\) at \( x\) .

Definition 4 (Mangasarian-Fromovitz constraint qualification)

MFCQ holds if the gradients of the equality constraints

\[ \begin{equation} \{\nabla_x g_i(x, p) \; : \; i \in [m_e]\} \; , \end{equation} \]

(20)

are linearly independent and if there exists a vector \( d \in \mathbb{R}^{n_x}\) such that

\[ \begin{equation} \nabla_x g_i(x, p)^\top d = 0 \;, \forall i \in [m_e] \, , ~ \nabla_x h_i(x, p)^\top d < 0 \;, \forall i \in \mathcal{I}_p(x) \, . \end{equation} \]

(21)

Theorem 2 ([1])

The set \( \mathcal{M}_p(x)\) is bounded if and only if MFCQ holds.

Proposition 1 (Dual MFCQ)

Using the theorem of alternatives, MFCQ is equivalent to \( 0\) being the unique solution of the linear system

\[ \begin{equation} \begin{aligned} & \sum_{i=1}^{m_e} y_i \nabla_x g_i(x, p) + \sum_{i\in \mathcal{I}_p(x)} z_i \nabla_x h_i(x, p) = 0 \; , \\ & z_i \geq 0 ~ \forall i \in \mathcal{I}_p(x) \; . \end{aligned} \end{equation} \]

(22)

Definition 5 (Strict Mangasarian-Fromovitz constraint qualification)


SMFCQ holds if

\[ \begin{equation} \{\nabla_x g_i(x, p) \; : \; i \in [m_e]\}~\cup~ \{\nabla_x h_i(x, p) \; : \; i \in \mathcal{I}_p^+(x; z) \} \;, \end{equation} \]

(23)

are linearly independent and if there exists a vector \( d \in \mathbb{R}^{n_x}\) such that

\[ \begin{equation} \begin{aligned} & \nabla_x g_i(x, p)^\top d = 0 \;, \forall i \in [m_e] \, , \\ & \nabla_x h_i(x, p)^\top d = 0 \;, \forall i \in \mathcal{I}_p^+(x; z) \, , \\ & \nabla_x h_i(x, p)^\top d < 0 \;, \forall i \in \mathcal{I}_p^0(x; z) \, . \end{aligned} \end{equation} \]

(24)

Proposition 2 (Proposition 1.1, [2])

Let \( w = (x, y, z)\) be a primal-dual solution. Then SMFCQ holds at \( x\) if and only if the multiplier vector \( (y, z)\) is unique.

Definition 6 (Constant-rank constraint qualification (CRCQ))

CRCQ holds if there exists a neighborhood \( W\) of \( x^\star\) such that for any subsets \( I\) of \( \mathcal{I}_p(x^\star)\) and \( J\) of \( [m_e]\) the family of gradient vectors

\[ \begin{equation} \{\nabla_x g_j(x, p) \; : \; j \in J\}~\cup~ \{\nabla_x h_i(x, p) \; : \; i \in I \} \; , \end{equation} \]

(25)

has the same rank for all vectors \( x \in W\) .

Definition 7 (Second-order sufficiency condition (SOSC))

SOSC holds at \( x\) if there exists \( (y, z) \in \mathcal{M}_p(x)\) such that

\[ \begin{equation} d^\top \nabla^2_{x x} L(x, y, z, p) d > 0 ~ \forall d \in \mathcal{C}_p(x) \;, \;d \neq 0 \;. \end{equation} \]

(26)

Proposition 3

Suppose SOSC holds at a solution \( x^\star\) . Then, there exists \( \sigma > 0\) and a neighborhood \( V\) of \( x^\star\) such that for all \( x \in X(p) \cap V\) (with \( x \neq x^\star\) ),

\[ \begin{equation} f(x) > f(x^\star) + \frac{\sigma}{2} \|x - x^\star \|^2 \; . \end{equation} \]

(27)

Definition 8 (Strong second-order sufficiency condition (SSOSC))

SSOSC holds at \( x\) if there exists \( (y, z) \in \mathcal{M}_p(x)\) such that

\[ \begin{equation} d^\top \nabla^2_{x x} L(x, y, z, p) d > 0 \; ~ \forall d \in \mathcal{D}_p(x; z) \;, d \neq 0 \;. \end{equation} \]

(29)

If the condition (29) hold for all multipliers in \( \mathcal{M}_p(x)\) , then we say that the generalized second-order sufficiency condition (GSSOSC) hold.

Theorem 3 (Implicit function theorem)

Let \( F: \mathbb{R}^{n} \times \mathbb{R}^{\ell} \to \mathbb{R}^{n}\) be a \( C^1\) function and \( (x^\star, p^\star) \in \mathbb{R}^{n} \times \mathbb{R}^{\ell}\) such that \( F(x^\star, p^\star) = 0\) . If the Jacobian \( J_x F(x^\star, p^\star)\) is invertible, then there exists a neighborhood \( U \subset \mathbb{R}^{\ell}\) of \( p^\star\) and a differentiable function \( x(p)\) such that \( F(x(p), p)= 0\) for all \( p \in U\) , and \( x(p)\) is the unique solution in a neighborhood of \( x^\star\) . In addition, \( x(\cdot)\) is \( C^1\) differentiable on \( U\) , with

\[ \begin{equation} J_p x(p) = -\big(J_x F(x, p)\big)^{-1} J_p F(x, p) \; . \end{equation} \]

(30)

Theorem 4 (Theorem 3.2.2 [3])

Suppose that at a KKT stationary solution \( x^\star\) of NLP\( (p^\star)\) , SOSC, LICQ and SCS hold. Then,

  1. the primal-dual solution \( w^\star = (x^\star, y^\star, z^\star_\mathcal{I})\) is unique;
  2. there exists a unique continuously differentiable function \( w(\cdot)\) defined in a neighborhood \( U\) of \( p^\star\) such that \( w(p^\star) =(x^\star, y^\star, z^\star_\mathcal{I})\) , with

    \[ \begin{equation} J_p w(p) = - M^{-1} N \; . \end{equation} \]

    (34)

  3. SCS and LICQ hold locally near \( p^\star\) .

Remark 1 (Linear programming)

If the problem (4) is a linear program (LP), we have \( \nabla_{xx}^2 \mathcal{L} = 0\) . The Goldman-Tucker theorem states there always exists a LP solution satisfying strict-complementarity. If the LP is not degenerate, LICQ holds and the solution is a vertex of the polyhedral feasibility set \( X(p)\) . This automatically implies SOCP (the critical cone reduces to \( \mathcal{C}_p(x^\star) = \{0 \}\) ) and there are exactly \( n\) binding constraints: \( m_e + | \mathcal{I}_p(x^\star) | = n\) . Hence Theorem 4 applies to the linear programming case. We denote the active Jacobian at the solution \( B = \begin{bmatrix} J_x g^\top & J_x h_\mathcal{I}^\top \end{bmatrix}^\top\) . The matrix \( B\) has dimension \( n \times n\) and is non-singular (LICQ implies it is full row-rank). We note that \( B\) is exactly the basis matrix used in the simplex algorithm. In that case, the inverse of the matrix \( M = \begin{bmatrix} 0 & B^\top \\ B & 0 \end{bmatrix}\) in (34) satisfies:

\[ \begin{equation} M^{-1} = \begin{bmatrix} 0 & B^{-1} \\ B^{-\top} & 0 \end{bmatrix} \; . \end{equation} \]

(36)

Hence, the computation of the sensitivities simplifies considerably if the problem is regular linear program. If the linear program is solved using the simplex algorithm, the solver usually returns a LU factorization for the basis matrix \( B\) .

Remark 2 (Exponential decay)

Certain optimization problems have a graph structure: in that case the primal-dual variable can be reordered following a set of nodes \( \mathcal{V}\) such that \( w = \{w_i \}_{i \in \mathcal{V}}\) . As a consequence, the Jacobian matrix \( M\) appearing in (34) can also be reordered following the graph structure: in the language of linear algebra, there exists a permutation matrix \( P\) such that the matrix \( P^\top M P\) is banded. It is known that terms in the inverse of a banded matrix decay exponentially fast away from the diagonal [4]. Then, assuming SSOSC, SCS and LICQ the solution of (34) satisfies the exponential decay of sensitivity property [5], that is, for two parameters \( p, p'\) close enough,

\[ \begin{equation} \| w_i(p) - w(p') \| \leq \sum_{j \in \mathcal{V}} C_{ij} \| p_j - p_j' \| \; , ~ \forall i \in \mathcal{V} \; , \end{equation} \]

(37)

with \( w_i(p)\) the primal-dual nodal solution at node \( i \in \mathcal{V}\) and \( C_{ij} = \Gamma \rho^{d_G(i, j)}\) (with \( \Gamma > 0\) and \( \rho > 0\) two positive constants) and \( d_G(i, j)\) the graph distance between node \( i\) and node \( j\) . The Equation (37) has important consequence for computing the sensitivities in a decentralized fashion [6] or building a preconditionner to solve (34) with an iterative method [7].

Proposition 4 (KKT system as generalized equation)

Let \( w = (x, y, z) \in \mathbb{R}^{n} \times \mathbb{R}^{m_e} \times \mathbb{R}^{m_i}\) be a primal-dual variable. A primal-dual solution \( w\) of the KKT system (9) is also solution of the generalized equation

\[ \begin{equation} F(w, p) + N_C(w) \ni 0 \; , \end{equation} \]

(38)

with \( F(w, p) := \begin{bmatrix} \phantom{-}\nabla_x \mathcal{L}(x, y, z, p) \\ -\nabla_y \mathcal{L}(x, y, z, p) \\ -\nabla_z \mathcal{L}(x, y, z, p) \end{bmatrix}\) and \( C = \mathbb{R}^{n} \times \mathbb{R}^{m_e} \times \mathbb{R}^{m_i}_+\) .

Proposition 5 (Theorem 4.1, [8])

Let \( w^\star = (x^\star, y^\star, z^\star)\) a primal-dual solution of the generalized equation (38). If SSOSC and LICQ hold at \( w^\star\) , then (38) is strongly regular at \( w^\star\) .

Proposition 6 (Theorem 2, [9])

Suppose that at a KKT stationary solution \( x^\star\) of NLP\( (p^\star)\) , LICQ and SSOSC hold. Then

  • there exists a local unique continuous primal-dual solution \( w(\cdot)\) which is directionally differentiable in every direction \( h\in \mathbb{R}^{\ell}\) ;
  • for a given direction \( h \in \mathbb{R}^{\ell}\) , the directional derivative \( w'(p; h)\) is the primal-dual solution of the QP problem defined as

    \[ \begin{equation} \begin{aligned} \min_{d}\; & \frac{1}{2} d^\top (\nabla^2_{x x}\mathcal{L}) \, d + h^\top (\nabla^2_{x p}\mathcal{L}) \,d \\ \text{s.t.} ~ & \nabla_x g_i(x, p)^\top d + \nabla_p g_i(x, p)^\top h = 0 & (\forall i \in [m_e]) \,,\\ & h_j(x, p) + \nabla_x h_j(x, p)^\top d + \nabla_p h_j(x, p)^\top h \leq 0 & (\forall j \in [m_i])\,, \end{aligned} \end{equation} \]

    (39)

    with \( \nabla_{x x}^2 \mathcal{L} = \nabla_{x x}^2 \mathcal{L}(x, y, z, p)\) , \( \nabla_{x p}^2 \mathcal{L} = \nabla_{xp}^2 \mathcal{L}(x, y, z, p)\) .

Proposition 7 (Theorem 5.1 [10])

Let \( (x^\star, y^\star, z^\star)\) be a solution of KKT. If \( \Phi\) is coherently oriented with respect to \( (x, y, z)\) at \( (x^\star, y^\star, z^\star)\) then there exists a neighborhood \( U\) of \( p\) and \( V\) of \( x^\star\) and a \( PC^1\) mapping \( w: U \to V\) such that for each \( p \in U\) \( w(p)\) is the unique solution of (40). Moreover, for any \( k \in \mathbb{N}\) and any \( R \in \mathbb{R}^{p \times k}\) the LD-derivatives \( x'(p; R)\) , \( y'(p, R)\) and \( z'(p, R)\) are the unique solution \( X\) , \( Y\) and \( Z = [Z^+ Z^0 Z^-]\) of the following nonsmooth equation system

\[ \begin{equation} \begin{aligned} \begin{bmatrix} \nabla^2_{x x} \mathcal{L} & G_x^\top & (H_x^+)^\top \\ G_x & 0 & 0 \\ H_x^+ & 0 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z^+ \end{bmatrix} = - \begin{bmatrix} \nabla^2_{xp} \mathcal{L} \\ G_p \\ H_p^+ \end{bmatrix} R \; , \\ \textbf{LMmin}(-H_p^0 R - H_x^0 X, Z^0) = 0 \; , \\ Z^- = 0 \;, \end{aligned} \end{equation} \]

(41)

where \( \bf{LMmin}\) is the lexicographic matrix minimum function defined in (108), and

\[ G_{(x,p)} = J_{(x, p)} g(x, p) \; , ~ H_{(x,p)}^+ = J_{(x, p)} h_{\mathcal{I}_p^+}(x, p) \; , ~ H_{(x,p)}^0 = J_{(x, p)} h_{\mathcal{I}_p^0}(x, p) \; . \]

Proposition 8

Let \( (x^\star, y^\star, z^\star)\) be a KKT stationary point. Suppose that LICQ and SSOSC holds at \( (x^\star, y^\star, z^\star)\) . Then \( \Phi\) is coherently oriented with respect to \( (x, y, z)\) .

Proposition 9 (Theorem 3.1 [11])

Let \( (x^\star, y^\star, z^\star)\) be a solution of KKT. Suppose that LICQ and SSOSC hold. Then, for any \( k \in \mathbb{N}\) and \( R \in \mathbb{R}^{\ell \times k}\) , the LD-derivatives \( x'(p, R)\) , \( y'(p, R)\) and \( z'(p, R)\) in the direction \( R\) are given as

\[ \begin{equation} \begin{aligned} & x'(p, R) = \big[ d_{(1)}(r_{(1)}) ~ d_{(2)}(r_{(2)}) ~ \cdots~ d_{(k)}(r_{(k)}) \big] \; , \\ & y'(p, R) = \big[ \alpha_{(1)}(r_{(1)}) ~ \alpha_{(2)}(r_{(2)}) ~ \cdots~ \alpha_{(k)}(r_{(k)}) \big] \; , \\ & z'(p, R) = \big[ \mu_{(1)}(r_{(1)}) ~ \mu_{(2)}(r_{(2)}) ~ \cdots~ \mu_{(k)}(r_{(k)}) \big] \; , \end{aligned} \end{equation} \]

(43)

with, for \( j = 1, …, k\) ,

\[ \begin{equation} \begin{aligned} & \mu_{(j)}(r_{(j)})_{\mathcal{A}^+_{(j)}} = \beta_{(j)}(r_{(j)})\;, \\ & \mu_{(j)}(r_{(j)})_{\mathcal{A}^0_{(j)}} = \gamma_{(j)}(r_{(j)}) \;,\\ & \mu_{(j)}(r_{(j)})_{\mathcal{A}^-_{(j)}} = 0 \; . \end{aligned} \end{equation} \]

(44)

Theorem 5 (Theorem 7.2, [12])

Suppose that at a KKT stationary solution \( x^\star\) of NLP\( (p^\star)\) , MFCQ and GSSOSC hold. Then, there are open neighborhoods \( U\) of \( p^\star\) and \( V\) of \( x^\star\) and a function \( x: U \to V\) such that \( x(\cdot)\) is continuous. The function \( x(p)\) is the unique local solution of NLP(\( p\) ) in \( V\) , and MFCQ holds at \( x(p)\) .

Proposition 10 (Theorem 4.2, [13])

Suppose that at a solution \( x^\star\) of NLP\( (p^\star)\) , SMFCQ and SSOSC hold. Then, \( x(\cdot)\) is directionally differentiable at \( p\) for every \( h \in \mathbb{R}^{\ell}\) , and the directional derivative \( x'(p^\star, h)\) is the unique solution of the QP problem

\[ \begin{equation} \min_{d}\; \frac{1}{2} d^\top (\nabla^2_{x x}\mathcal{L}) \, d + h^\top (\nabla^2_{x p} \mathcal{L}) \, d ~ \text{s.t.} ~ d \in \mathcal{K}_{p^\star}(x^\star, z^\star; h) \;, \end{equation} \]

(45)

with \( \mathcal{K}_{p^\star}\) the critical cone defined in (13) and \( \nabla_{x x}^2 \mathcal{L} = \nabla_{x x}^2 \mathcal{L}(x^\star, y^\star, z^\star, p^\star)\) , \( \nabla_{x p}^2 \mathcal{L} = \nabla_{x p}^2 \mathcal{L}(x^\star, y^\star, z^\star, p^\star)\) .

Proposition 11 (Theorem 2.2, [14])

Suppose that at a KKT stationary solution \( x^\star\) of NLP\( (p^\star)\) , MFCQ, CRCQ and GSSOSC hold. Then, \( x(\cdot)\) is directionally differentiable at \( p^\star\) for every \( h \in \mathbb{R}^{\ell}\) , and the directional derivative \( x'(p^\star, h)\) is the unique solution of the QP problem, defined for a given \( (y, z) \in \mathcal{E}_{p^\star}(x^\star)\) ,

\[ \begin{equation} \min_{d}\; \frac{1}{2} d^\top (\nabla_{x x}^2 \mathcal{L}) \, d + h^\top (\nabla_{x p}^2 \mathcal{L}) d ~ \text{s.t.} ~ d \in \mathcal{K}_{p^\star}(x^\star, z; h) \;, \end{equation} \]

(46)

with \( \mathcal{K}_{p^\star}\) the critical cone defined in (13) and \( \nabla_{x x}^2 \mathcal{L} = \nabla_{x x}^2 \mathcal{L}(x^\star, y, z, p^\star)\) , \( \nabla_{x p}^2 \mathcal{L} = \nabla_{x p}^2 \mathcal{L}(x^\star, y, z, p^\star)\) .

Lemma 1 (Lemma 2.2, [15])

The critical cone \( \mathcal{K}_p(x^\star, z; h)\) is nonempty if and only if \( (y, z) \in S_p(x^\star; h)\) .

Theorem 6 (Theorem 2, [16])

Suppose that at a KKT stationary solution \( x^\star\) of NLP\( (p^\star)\) , MFCQ, CRCQ and GSSOSC hold. Then,

  • There is open neighborhoods \( U\) of \( p^\star\) and \( V\) of \( x^\star\) and a function \( x:U \to V\) such that \( x(p)\) is the unique local solution of NLP(\( p\) in \( V\) and MFCQ holds at \( x(p)\) ;
  • \( x(\cdot)\) is a PC\( ^1\) function (hence locally Lipschitz and B-differentiable);
  • For each \( p \in U\) , the directional derivative \( x'(p; \cdot)\) is a piecewise linear function, and for \( h \in \mathbb{R}^{\ell}\) and \( (y, z) \in S_p(x^\star; h)\) , \( x'(p; h)\) is the unique solution of (46).

Theorem 7 (Danskin’s Lemma [17])

Suppose \( X\) is nonempty and compact, and \( f(x, \cdot)\) is continuously differentiable at \( p \in \mathbb{R}^{\ell}\) . Then \( \varphi\) is locally Lipschitz near \( p\) , and directionally differentiable in every direction \( h \in \mathbb{R}^{\ell}\) , with

\[ \begin{equation} \varphi'(p; h) = \min_{x \in \Sigma(p)} \; \nabla_p f(x, p)^\top h \; , \end{equation} \]

(53)

with \( \Sigma(p)\) the solution set defined in (7).

Theorem 8 ([18])

Suppose \( M\) is a closed convex set, \( f(\cdot, p)\) and \( h_i(\cdot, p)\) are convex on \( M\) for all \( p\) , and continuously differentiable on \( M \times N\) , with \( N\) a neighborhood of \( p\) . If (i) the solution set \( \Sigma(p)\) is nonempty and bounded, (ii) \( \varphi(p)\) is finite, (iii) there is a point \( \hat{x} \in M\) such that \( h(\hat{x}, p) < 0\) (Slater), then \( \varphi\) is directionally differentiable, and for all \( h\in \mathbb{R}^{\ell}\) ,

\[ \begin{equation} \varphi'(p; h) = \min_{x \in \Sigma(p)} \max_{z \in \mathcal{M}_p(x)} \nabla_p \mathcal{L}(x, z, p)^\top h \;, \end{equation} \]

(54)

with \( \mathcal{M}_p(x)\) the multiplier solution set.

Theorem 9 (Theorem 2.3.4 [19])

Suppose that \( X(p)\) is nonempty and uniformly compact near \( p\) and MFCQ holds at each \( x \in \Sigma(p)\) . Then \( \varphi\) is locally Lipschitz near \( p\) , and for any \( h \in \mathbb{R}^{\ell}\) ,

\[ \begin{equation} \begin{aligned} \inf_{x \in \Sigma(p)} \min_{(y, z) \in \mathcal{M}_p(x)} \nabla_p \mathcal{L}(x, y, z, p)^\top h & \leq \lim_{t \downarrow 0} \inf \frac{1}{t}\big(\varphi(p + th) - \varphi(p)\big) \\ & \leq \lim_{t \downarrow 0} \sup \frac{1}{t}\big(\varphi(p + th) - \varphi(p)\big) \\ & \leq \inf_{x \in \Sigma(p)} \max_{(y, z) \in \mathcal{M}_p(x)} \nabla_p \mathcal{L}(x, y, z, p)^\top h \;. \end{aligned} \end{equation} \]

(55)

Proposition 12 (Corollary 2.3.8 [19])

Let \( f\) and \( h\) be convex functions in \( x\) , \( g\) be affine in \( x\) , such that \( f,g, h\) are all jointly \( C^1\) in \( (x, p)\) . Then, if \( X(p)\) is nonempty and uniformly compact near \( p\) and MFCQ holds at each \( x \in \Sigma(p)\) , then \( \varphi\) is directionally differentiable at \( p\) and

\[ \begin{equation} \varphi'(p; h) = \min_{x \in \Sigma(p)} \max_{(y, z) \in \mathcal{M}_p(x)} \nabla_p \mathcal{L}(x, y, z, p)^\top h \; . \end{equation} \]

(56)

Theorem 10 (Theorem 7.5 [20])

Suppose GSSOSC and MFCQ hold at a stationary solution \( x^\star\) . Then near \( p\) the function \( \varphi(\cdot)\) has a finite one-sided directional derivative, and for every \( h \in \mathbb{R}^{\ell}\) ,

\[ \begin{equation} \varphi'(p; h) = \max_{(y, z) \in \mathcal{M}_p(x^\star)} \nabla_p \mathcal{L}(x^\star, y, z, p)^\top h \; . \end{equation} \]

(58)

Theorem 11 (Theorem 3.4.1 [19])

Suppose SOSC, SCS and LICQ hold at a stationary KKT solution \( (x^\star, y^\star, z^\star)\) . Then the local value function \( \varphi_\ell\) is \( C^2\) near \( p\) , and

  • \( \varphi_\ell(p) = \mathcal{L}(x^\star, y^\star, z^\star, p)\) .
  • \( \nabla_p \varphi_\ell(p) = \nabla_p \mathcal{L}(x^\star, y^\star, z^\star, p)\) .
  • and also

    \[ \begin{equation} \begin{aligned} \nabla^2_{pp} \varphi_\ell(p) = \nabla_p \big(\nabla_p \mathcal{L}(w^\star, p)\big) &= \nabla_{pp}^2 \mathcal{L} + \nabla_{wp}^2 \mathcal{L} \; J_p w(p) \;, \\ &= \nabla_{pp}^2 \mathcal{L} - \nabla_{wp}^2 \mathcal{L} \, (\nabla_{ww}^2 \mathcal{L})^{-1} \nabla_{pw}^2 \mathcal{L} \;. \end{aligned} \end{equation} \]

    (60)

Corollary 1

Suppose the conditions of Theorem 11 hold at a stationary solution \( (x^\star, y^\star, z^\star)\) of problem (61). Then, in a neighborhood of \( p\) the local value function \( \varphi_\ell\) is \( C^2\) , with

  • \( \varphi_\ell(p) = f(x^\star, p)\) .
  • \( \nabla_p \varphi_\ell(p) = \nabla_p f(x^\star)\) .
  • \( \nabla^2_{pp} \varphi_\ell(p) = \nabla_{pp}^2 f + (\nabla_{xp}^2 f) \, J_p x(p)\) .

Corollary 2 (Optimal value function derivatives for RHS perturbations)


Suppose the conditions of Theorem 11 hold at a stationary solution \( (x^\star, y^\star, z^\star)\) of problem (62). Then, in a neighborhood of \( (p, q)\) the local value function \( \varphi_\ell\) is \( C^2\) , with

\[ \begin{equation} \nabla \varphi_\ell(p, q) = \begin{bmatrix} y^\star \\ z^\star \end{bmatrix} ~ \text{and} ~ \nabla^2 \varphi_\ell(p^\star) = \begin{bmatrix} J_p y(p, q) & J_q y(p, q) \\ J_p z(p, q) & J_q z(p, q) \end{bmatrix} \; . \end{equation} \]

(63)

Proposition 13 (Theorem 6.2.1 [19])

Let \( p^\star \in \mathbb{R}^{\ell}\) . Suppose that a stationary solution \( x^\star\) of NLP(\( p\) LICQ, SOSC and SCS hold. Then in a neighborhood of \( (0, p^\star)\) there exists a unique differentiable function \( w(r, p) = (x(r, p), y(r, p), z(r, p))\) satisfying

\[ \begin{equation} \begin{aligned} & \nabla_x \mathcal{L}(x, y, z, p) = 0 \;, \\ & g_j(x, p) = y_j r \; & \forall j \in [m_e] , \\ & z_i h_i(x, p) = r \;, &\forall i \in [m_i] \;, \end{aligned} \end{equation} \]

(75)

and such that \( w(0, p^\star) = (x^\star, y^\star, z^\star)\) . In addition, for any \( (r, p)\) near \( (0, p^\star)\) , \( x(r, p)\) is locally unique unconstrained local minimizing point of \( W(x, r, p)\) with \( h_i(x(r, p), p) < 0\) and \( \nabla^2_{x x} W(x(r, p), r, p)\) positive definite.

Proposition 14 (Convergence of interior-point)

Let \( w^\star = (x^\star, y^\star, z^\star)\) be a local primal-dual stationary point of (78). Suppose that LICQ, SCS and SOSC hold at \( w^\star\) . We denote by \( x(\mu_k, p)\) the solution of the barrier problem (79) at iteration \( k\) . Then

  • There is at least one subsequence of \( x(\mu_k, p)\) converging to \( x^\star\) and for every convergent subsequence, the corresponding barrier multiplier approximation is bounded and converges to \( (y^\star, z^\star)\) .
  • There exists a unique continuously differentiable function \( w(\cdot, p)\) existing in a neighborhood of \( 0\) for \( \mu > 0\) and solution of (79), and such that

    \[ \begin{equation} \lim_{\mu\to 0} w(\mu, p) = w^\star\; . \end{equation} \]

    (81)

  • For \( \mu\) small enough,

    \[ \begin{equation} J_p w(\mu, p) = - (J_w F_\mu)^{-1} J_p F_\mu \; . \end{equation} \]

    (82)

Definition 9 (Directional derivatives)

Let \( x \in \Omega\) . The function \( f\) has a directional derivative at \( x\) in the direction \( h \in E\) if \( x + th \in \Omega\) for \( t\) small enough, and if the following limit exists

\[ \begin{equation} f'(x; h) = \lim_{t \downarrow 0} \; \dfrac{f(x + th) - f(x)}{t} \; . \end{equation} \]

(85)

Definition 10 (Gâteaux-differentiability)

The function \( f\) is Gâteaux differentiable (or \( G\) -differentiable) in \( x \in \Omega\) if there exists a directional derivatives for every direction \( h \in E\) , and if the application

\[ \begin{equation} h \to f'(x; h) \; , \end{equation} \]

(86)

is continuous and linear. We note \( f'(x)\) the operator such that \( f'(x; h) = f'(x) \cdot h\) .

Definition 11 (Fréchet-differentiability)

The function \( f\) is Fréchet differentiable (\( F\) -differentiable, or simply differentiable) at \( x \in \Omega\) if there exists a bounded linear operator \( L\) such that

\[ \begin{equation} \lim_{\|h\| \downarrow 0} \; \dfrac{1}{\|h\|}\Big(f(x + h) - f(x) - L h \Big) = 0 \; . \end{equation} \]

(87)

Definition 12 (Hadamard-differentiability)

The function \( f\) is Hadamard differentiable at \( x\) if for any mapping \( \varphi: \mathbb{R}_+ \to X\) such that \( \varphi(0) = x\) and \( \frac{1}{t} (\varphi(t) - \varphi(0))\) converges to a vector \( h\) as \( t \downarrow 0\) , the limit

\[ \begin{equation} d_H f(x; h) = \lim_{t \downarrow 0} \; \frac{1}{t} (f(\varphi(t)) - f(x)) \; , \end{equation} \]

(89)

does exist.

Proposition 15 (Chain-rule)

Let \( f\) be Hadamard directionally differentiable at \( x\) , and \( g\) be Hadamard directionally differentiable at \( f(x)\) . Then the composite mapping \( g \circ f\) is Hadamard directionally differentiable at \( x\) , and satisfies the chain-rule \( d_H (g \circ f)(x; h) = d_H g (f(x); d_H f(x; y))\) .

Theorem 12 (Rademacher Theorem)

Let \( f: \Omega \to F\) be a locally Lipschitz continuous function. Then \( f\) is differentiable almost everywhere in the sense of the Lebesgue measure.

Definition 13 (Dini-derivatives)

Let \( x \in \Omega\) . The upper Dini derivative in the direction \( h\) is defined as

\[ \begin{equation} D^+ f(x; h) = \lim_{t \downarrow 0} \sup \left( \dfrac{f(x + th) - f(x)}{t} \right) \;. \end{equation} \]

(91)

Accordingly, the lower Dini derivative is defined as

\[ \begin{equation} D^- f(x; h) = \lim_{t \downarrow 0} \inf \left( \dfrac{f(x + th) - f(x)}{t} \right) \;. \end{equation} \]

(92)

Definition 14 (Bouligand-differential)

Let \( \mathcal{D}_f\) be the set of points at which \( f\) is (Fréchet) differentiable. The \( B\) -differential of \( f\) at \( x \in \Omega\) is the set defined as

\[ \begin{equation} \partial_B f(x) = \big\{ J \in \mathcal{L}(E, F) \; : \; \exists \{x_k \}_k \in \mathcal{D}_f \; \text{such that} \; x_k \to x \;, f'(x_k) \to J \big\} \; . \end{equation} \]

(93)

Definition 15 (Clarke-differential)

The \( C\) -differential of \( f\) at \( x \in \Omega\) is the convex hull of the Bouligand-differential, that is

\[ \begin{equation} \partial_C f(x) = \text{conv} \; \partial_B f(x) \;. \end{equation} \]

(94)

Definition 16 (Clarke-generalized directional derivative)


The Clarke-generalized directional derivative of \( f\) in the direction \( h\) is defined by

\[ \begin{equation} d_C f(x; h) = \lim_{y \to x} \sup_{t \downarrow 0} \frac 1t \big[f(y + th) - f(y) \big] \; . \end{equation} \]

(95)

Proposition 16

Suppose that \( f\) is \( L\) -Lipschitz near \( x \in \Omega\) . Then

  1. \( \partial_C f(x)\) is nonempty compact and convex,
  2. \( \partial_C f(x)\) is locally bounded and upper semi-continuous at \( x\) .
  3. \( d_C f(x; \cdot)\) is the support function of \( \partial_C f(x)\) , that is,

    \[ \begin{equation} d_C f(x; h) = \max \{ s^\top h \; : \; \forall s \in \partial_C f(x) \} \;. \end{equation} \]

    (96)

Proposition 17

If \( f\) is \( G\) -differentiable at \( x\) , then \( f'(x) \in \partial_C f(x)\) . If in addition \( f\) is \( C^1\) at \( x\) , then \( \partial_C f(x) = \{ f'(x) \}\) .

Proposition 18 (Chain-rule)

Suppose that \( f\) is \( L\) -Lipschitz continuous on \( \Omega\) . If the function \( g:F \to G\) is Lipschitz-continuous near \( f(x)\) , then

\[ \begin{equation} \partial_C (g \circ f) (x) \subset \text{conv} \big\{ G F \; : \; G \in \partial_C g(f(x)) \;, \; F \in \partial_C f(x) \big\} \; . \end{equation} \]

(97)

Definition 17 (Semi-smoothness)

The function \( f:\Omega \to F\) is semi-smooth in \( x \in \Omega\) if:

  1. \( f\) is \( L\) -Lipschitz near \( x\) ;
  2. \( f\) has directional derivatives at \( x\) in all directions;
  3. When \( h \to 0\) , \( \sup_{J \in \partial_C f} \|f(x+h) -f(x) - Jh \| = o(\|h\|)\) .

Proposition 19

Suppose \( f\) is \( L\) -Lipschitz near \( x\) , and admits directional derivatives at \( x\) in all directions. Then the following statements are equivalent:

  1. \( f\) is semi-smooth ;
  2. for \( h \to 0\) , \( \sup_{J \in \partial_C f} \|Jh - f'(x;h)\| = o(\|h\|)\) .
  3. for \( h \to 0\) such that \( x+h \in \mathcal{D}_f\) , \( f'(x+h) h - f'(x; h)= o(\|h\|)\) .

Definition 18

The function \( f:X \to F\) is piecewise-differentiable (PC\( ^1\) ) at \( x \in X\) if there exists a neighborhood \( N \subset X\) and a finite collection of \( C^1\) selection functions \( \mathcal{F}_f(x) = \{f_{(1)}, …, f_{(k)} \}\) defined on \( N\) such that \( f\) continuous on \( N\) and for all \( y \in N\) , \( f(y) \in \{f_{(i)}(y)\}_{i \in [k]}\) .

Proposition 20

Let \( f:X \to F\) a PC\( ^1\) function at \( x \in X\) . Then

  • \( \mathcal{I}_f^{ess}(x)\) is a non-empty set;
  • \( f\) is Lipschitz continuous near \( x\) and directionally differentiable at \( x\) , with

    \[ \begin{equation} \partial_B f(x) = \{ \nabla f_{(i)}(x) \; : \; i \in \mathcal{I}_f^{ess}(x) \}. \end{equation} \]

    (99)

    In addition, the directional derivative \( f'(x; \cdot)\) is piecewise linear.

Definition 19 (Coherent orientation)

Let \( U \subset \mathbb{R}^n\) and \( V \subset \mathbb{R}^m\) be open. A \( PC^1\) function \( F:U \times W \to \mathbb{R}^n\) is said coherently oriented with respect to \( x\) at \( (x^\star, y^\star) \in U \times V\) if all matrices in \( \partial_{B,x} f(x^\star, y^\star)\) have the same non-vanishing determinant sign.

Definition 20 (Lexicographically smooth function)

Let \( f:X \to \mathbb{R}^m\) a locally Lipschitz continuous function on \( X\) . \( f\) is lexicographically smooth at \( x\) if for any \( k \in \mathbb{N}\) and any \( M = [m_{(1)}, …, m_{(k)}] \in \mathbb{R}^{n \times k}\) the following higher-order derivatives are well defined

\[ \begin{equation} \begin{aligned} & f^{(0)}_{x, M}: \mathbb{R}^n \to \mathbb{R}^m : d \to f'(x; d) \\ & f^{(j)}_{x, M}: \mathbb{R}^n \to \mathbb{R}^m : d \to [f^{(j-1)}_{x, M}]'(m_{(j)}; d) & \forall j =1,\cdots, k \\ \end{aligned} \end{equation} \]

(100)

Definition 21 (Lexicographic derivative)

Let \( f:X \to \mathbb{R}^m\) a L-smooth function at \( x\) , and \( M \in \mathbb{R}^{n \times n}\) a nonsingular matrix. The lexicographic derivative of \( f\) at \( x\) in the direction \( M\) is defined as

\[ \begin{equation} d_L f(x; M) = \nabla f^{(n)}_{x, M}(0) \in \mathbb{R}^{m \times n} \;. \end{equation} \]

(101)

The lexicographic subdifferential of \( f\) at \( x\) is defined as

\[ \begin{equation} \partial_L f(x) = \{ d_L f(x; N) \; : \; N \in \mathbb{R}^{n\times n} \;,\; \text{det}(N) \neq 0 \} \;. \end{equation} \]

(102)

For any \( k \in \mathbb{N}\) , \( M = [m_{(1)}, …, m_{(k)}] \in \mathbb{R}^{n \times k}\) the LD-derivative of \( f\) in the directions \( M\) is defined as

\[ \begin{equation} f'(x; M) = \big[ f^{(0)}_{x, M}(m_{(1)}), ~ f^{(1)}_{x, M}(m_{(2)}), ~ \cdots, f^{(k-1)}_{x, M}(m_{(k)}) \big] \; . \end{equation} \]

(103)

Proposition 21 (Chain-rule)

Let \( h:X \to Y\) and \( g:Y \to \mathbb{R}^q\) be L-smooth at \( x \in X\) and \( h(x) \in Y\) , respectively. Then the function \( g\circ h\) is L-smooth at \( x \in X\) , and for any \( k \in \mathbb{N}\) and \( M \in \mathbb{R}^{n \times k}\) ,

\[ \begin{equation} [g \circ h]'(x; M) = g'(h(x); h'(x; M)) \; . \end{equation} \]

(105)

Definition 22

Let \( \Gamma: E \rightrightarrows F\) a point-to-set map.

  • \( \Gamma\) is upper semicontinuous at \( x \in E\) if for each open set \( \Omega \subset F\) satisfying \( \Gamma(x) \subset \Omega\) there exists a neighborhood \( N(x)\) of \( x\) such that for all \( y \in N(x)\) , \( \Gamma(y) \subset \Omega\) .
  • \( \Gamma\) is lower semicontinuous at \( x \in E\) if for each open set \( \Omega \subset F\) satisfying \( \Gamma(x) \cap \Omega \neq \varnothing\) there exists a neighborhood \( N(x)\) of \( x\) such that for all \( y \in N(x)\) , \( \Gamma(y) \cap \Omega \neq \varnothing\) .
  • \( \Gamma\) is continuous at \( x\) if it is lower semicontinuous and upper semicontinuous at \( x\) .

Definition 23

The point-to-set map \( \Gamma:E \rightrightarrows F\) is closed at \( x\) if there exists a sequence \( x_n \in E\) , \( x_n\to x\) such that \( y_n \in \Gamma(x_n)\) and \( y_n \to y\) imply \( y \in \Gamma(x)\) .

Definition 24

The point-to-set map \( \Gamma:E \rightrightarrows F\) is open at \( x\) if \( y \in \Gamma(x)\) implies there exists a sequence \( x_n \in E\) , \( x_n\to x\) such that there exists \( m\) and \( \{y_n\}\) such that \( y_n \in \Gamma(x_n)\) for all \( n \geq m\) and \( y_n \to y\) .

Definition 25

The point-to-set map \( \Gamma:E \rightrightarrows F\) is uniformly compact near \( \hat{x}\) if the set \( \cup_{x \in N(\hat{x})} \Gamma(x)\) is bounded for some neighborhood \( N(\hat{x})\) of \( \hat{x}\) .

Definition 26

The point-to-set map \( \Gamma: E \rightrightarrows F\) is upper Lipschitzian with modulus \( L\) at \( \hat{x} \in E\) if there is a neighborhood \( N\) of \( \hat{x}\) such that for each \( x \in N\) , \( \Gamma(x) \subset \Gamma(\hat{x}) + L \|x - \hat{x} \|B\) , with \( B\) the unit-ball in \( F\) .

Definition 27 (Normal cone)

We define the normal cone operator associated to the set \( C\) as, for \( x \in X\)

\[ \begin{equation} N_C(x) = \left\{ \begin{aligned} & \{ y \in X^\star \; : \; \langle y , v - x \rangle \leq 0 \;, \; \forall v \in C \} & \text{if} ~ x \in C \; ,\\ & \varnothing\;, & \text{if}~ x \notin C \; . \end{aligned} \right. \end{equation} \]

(109)

Definition 28 (Variational inequality)

Let \( p \in P\) be a parameter. For \( f: X \times P \to X^\star\) , we define the paramaterized variational inequality (also called generalized equation) the problem of finding \( x \in X\) such that

\[ \begin{equation} f(x, p) + N_C(x) \ni 0 \; . \end{equation} \]

(110)

Definition 29 (Strong regularity [8])

We say that (110) is strongly regular at a solution \( x_0\) with Lipschitz constant \( L\) if there exists a neighborhood \( U \subset X^\star\) of the origin and \( V \subset X\) of \( x_0\) such that the restriction to \( U\) of \( T^{-1} \cap V\) is a single-valued function from \( U\) to \( V\) and is \( L\) -Lipschitzian on \( U\) .

Theorem 13 (Theorem 2.1, [8])

Let \( p \in P\) be a given parameter, \( x_0\) solution of the generalized equation (110). Suppose that \( f\) is Fréchet-differentiable w.r.t. \( x\) exists at \( p\) , and that both \( f(\cdot, \cdot)\) and \( \nabla_x f(\cdot, \cdot)\) are continuous at \( (x_0, p)\) . If (110) is strongly regular at \( x_0\) , then for any \( \varepsilon >0\) there exists neighborhood \( N\) of \( p\) and \( W\) of \( x_0\) as well as a single-valued function \( x:N \to W\) sucht that for any \( p \in N\) , \( x(p)\) is the unique solution of the inclusion

\[ \begin{equation} f(x, p) + N_C(x) \ni 0 \; , \end{equation} \]

(115)

and, for all \( p, q \in N\) , one has

\[ \begin{equation} \| x(p) - x(q) \| \leq (L + \varepsilon) \| f(x(p), p) - f(x(q), q) \| \; . \end{equation} \]

(116)

Corollary 3 (Locally-Lipschitz condition)

Suppose the hypotheses of Theorem 13 hold. Suppose in addition there exists a constant \( \kappa\) such that for each \( (p, q) \in N\) , for each \( x \in W\) , one has

\[ \begin{equation} \| f(x, p) - f(x, q) \| \leq \kappa \| p - q \| \; . \end{equation} \]

(117)

Then \( x(\cdot)\) is Lipschitzian on \( N\) with modulus \( \kappa (L + \varepsilon)\) .

References

[1] J. Gauvin, A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming, Mathematical Programming, 12 (1977), pp. 136–138.

[2] J. Kyparisis, On uniqueness of Kuhn-Tucker multipliers in nonlinear programming, Mathematical Programming, 32 (1985), pp. 242–246.

[3] A. V. Fiacco, Sensitivity analysis for nonlinear programming using penalty methods, Mathematical programming, 10 (1976), pp. 287–311.

[4] S. Demko, W. F. Moss, and P. W. Smith, Decay rates for inverses of band matrices, Mathematics of computation, 43 (1984), pp. 491–499.

[5] Exponential decay of sensitivity in graph-structured nonlinear programs, SIAM Journal on Optimization, 32 (2022), pp. 1156–1183.

[6] L. F. Valenzuela, R. Brown, and M. Pavone, Decentralized implicit differentiation, arXiv preprint arXiv:2403.01260, (2024).

[7] S. Shin, M. Anitescu, and V. M. Zavala, Overlapping schwarz decomposition for constrained quadratic programs, in 2020 59th IEEE Conference on Decision and Control (CDC), IEEE, 2020, pp. 3004–3009.

[8] Strongly regular generalized equations, Mathematics of Operations Research, 5 (1980), pp. 43–62.

[9] K. Jittorntrum, Solution point differentiability without strict complementarity in nonlinear programming, Sensitivity, Stability and Parametric Analysis, (1984), pp. 127–138.

[10] P. Stechlinski, K. A. Khan, and P. I. Barton, Generalized sensitivity analysis of nonlinear programs, SIAM Journal on Optimization, 28 (2018), pp. 272–301.

[11] P. Stechlinski, J. Jäschke, and P. I. Barton, Generalized sensitivity analysis of nonlinear programs using a sequence of quadratic programs, Optimization, 68 (2019), pp. 485–508.

[12] M. Kojima, Strongly stable stationary solutions in nonlinear programs, in Analysis and computation of fixed points, Elsevier, 1980, pp. 93–138.

[13] A. Shapiro, Second order sensitivity analysis and asymptotic theory of parametrized nonlinear programs, Mathematical Programming, 33 (1985), pp. 280–299.

[14] Sensitivity analysis for nonlinear programs and variational inequalities with nonunique multipliers, Mathematics of operations research, 15 (1990), pp. 286–298.

[15] S. Dempe, Directional differentiability of optimal solutions under Slater's condition, Mathematical programming, 59 (1993), pp. 49–69.

[16] D. Ralph and S. Dempe, Directional derivatives of the solution of a parametric nonlinear program, Mathematical programming, 70 (1995), pp. 159–172.

[17] J. M. Danskin, The theory of max-min and its application to weapons allocation problems, vol. 5, Springer Science & Business Media, 1967.

[18] W. W. Hogan, Point-to-set maps in mathematical programming, SIAM review, 15 (1973), pp. 591–603.

[19] Introduction to sensitivity and stability analysis in non linear programming, New York: Academic Press, 1983.

[20] A. V. Fiacco and J. Kyparisis, Sensitivity analysis in nonlinear programming under second order assumptions, in Systems and Optimization: Proceedings of the Twente Workshop Enschede, The Netherlands, April 16–18, 1984, Springer, 1984, pp. 74–97.