Open in Colab

Week 6

Exercise 1


Let \(A\in \mathbb S^n\), \(B\in\mathbb S^m\) be symmetric matrices and let \(X\in \mathbb R^{n\times m}\). We consider the symmetric block matrix

\[\begin{split}M = \begin{pmatrix}A & X \\ X^\top &B\end{pmatrix}\end{split}\]

Exercise 1a)


Show that if \(M\succ 0\) then \(A\succ 0\) and \(B\succ 0\). Similarly, show that if \(M\succeq 0\) then \(A\succeq 0\) and \(B\succeq 0\)

Let \(u\in \mathbb R^n\) and \(v\in \mathbb R^m\). First assume \(u\neq 0\) and \(v=0\), then

\[\begin{split}u^\top A u = \begin{pmatrix}u\\ v\end{pmatrix}^\top M \begin{pmatrix}u\\ v\end{pmatrix} >0\end{split}\]

hence \(A\succ 0\). Similarly picking \(u=0\) and \(v\neq 0\) shows \(B\succ 0\).

The result for \(M \succeq 0\) follows by replacing the \(>\) above by \(\geq\).

Exercise 1b)


Assuming \(B\succ 0\), define the \(\textit{Schur complement}\) \(S\) of \(A\) in \(M\) by

\[S := A-X B^{-1} X^\top\]

Find a non-singular lower triangular matrix \(C\) of form

\[\begin{split}C = \begin{pmatrix}I_n & 0 \\ E & I_m\end{pmatrix}\end{split}\]

such that

\[\begin{split}C^\top M C = \begin{pmatrix}S & 0 \\ 0 &B\end{pmatrix}\end{split}\]


\[\begin{split}C = \begin{pmatrix}I_n&0 \\ -B^{-1}X^\top & I_m\end{pmatrix}\end{split}\]


\[\begin{split}\begin{align} C^\top M C &= \begin{pmatrix}I_n& E^\top \\ 0 & I_m\end{pmatrix} \begin{pmatrix}A & X \\ X^\top &B\end{pmatrix} \begin{pmatrix}I_n&0 \\ E^\top & I_m\end{pmatrix}\\ &= \begin{pmatrix}A +E^\top X^\top & X+E^\top B \\ X^\top &B\end{pmatrix} \begin{pmatrix}I_n&0 \\ E & I_m\end{pmatrix}\\ &= \begin{pmatrix}A+E^\top X^\top +XE+E^\top B E & X+E^\top B \\ X^\top + BE & B \end{pmatrix} \end{align}\end{split}\]

This gives that \(E = -B^{-1} X^\top\) (\(B\) is invertible since \(B\succ 0\)). The top left entry is then

\[\begin{split}\begin{align} A+E^\top X^\top + XE+E^\top BE &= A-X B^{-1} X^\top - XB^{-1}X^\top + XB^{-1}BB^{-1}X^\top \\ &= A-X B^{-1} X^\top\\ &=S \end{align}\end{split}\]

as required.

Exercise 1c)


Let \(A\in\mathbb S^n\) be a symmetric matrix, and let \(X\in R^{n\times n}\) be a non-singular matrix. Show that

\[A\succ 0 \quad\Leftrightarrow \quad X^\top A X \succ 0\]

Similarly also show that

\[A\succeq 0 \quad\Leftrightarrow \quad X^\top A X \succeq 0\]

Suppose \(A\succ 0\) and let \(x\neq 0\), then

\[x^\top X^\top A X x = (X x)^\top A (X x) >0\]

since \(Xx\neq 0\) by the fact that \(X\) is non-singular. Hence \(X^\top A X\succ 0\).

Now suppose \(X^\top A X\succ 0\) and let \(x\neq 0\). Then define \(y = X^{-1}x\). We have,

\[x^\top A x = (Xy)^\top A Xy = y^\top(X^\top A X) y >0\]

we conclude that \(A\succ 0\).

The statement for \(A\succeq 0\) has the same proof, replacing \(>\) with \(\geq\).

Exercise 1d)


Assume \(B\succ 0\).

Prove that \(M\succ 0\) if and only if \(S\succ 0\).

Similarly, prove that \(M\succeq 0\) if and only if \(S\succeq 0\).

By c) we have that \(M\succ 0\) if and only if \(C^\top M C\succ 0\). If \(C^\top M C\succ 0\) then \(S\succ 0\) by a).

Similarly if \(S\succ 0\) then \(C^\top M C\succ 0\) since \(B\succ 0\), and hence \(M\succ 0\).

Again, the proof for \(M\succeq 0\) is identical.

Exercise 1e)


Suppose \(M\succ 0\). Using the results of a)-d), prove by induction that \(M\) admits a decomposition

\[M = L^\top D L\]

where \(L\) is non-singular and lower triangular, and \(D\) is diagonal.

If \(M\) is a \(1\times 1\) matrix, the result is trivial. Assume the result holds if \(M\) is an \(n\times n\) matrix, and write it in block form:

\[\begin{split}M = \begin{pmatrix}A & X \\ X^\top &B\end{pmatrix}\end{split}\]

Then since \(M\succ 0\), we have \(B\succ 0\) (exercise 1a), hence we can use

\[\begin{split}C^\top M C = \begin{pmatrix}S & 0 \\ 0 &B\end{pmatrix}\end{split}\]

with \(C\) as in 1b) lower-triangular and non-singular. By 1d) we have that \(S\succ 0\), and both \(S\) and \(B\) are of size \(\leq n\). By induction we can thus write

\[\begin{split}\begin{align} \begin{pmatrix}S & 0 \\ 0 &B\end{pmatrix} &= \begin{pmatrix}L_1^\top D_1 L_1 & 0 \\ 0 &L_2^\top D_2 L_2\end{pmatrix}\\ &=\begin{pmatrix}L_1 & 0\\ 0 & L_2\end{pmatrix}^\top\begin{pmatrix} D_1& 0 \\ 0 &D_2\end{pmatrix}\begin{pmatrix}L_1 & 0\\ 0 & L_2\end{pmatrix} \end{align}\end{split}\]

Then finally we have \(M = L^\top DL\) with \(D = \operatorname{diag}(D_1,D_2)\) and

\[\begin{split}L = \begin{pmatrix}L_1 & 0\\ 0 & L_2\end{pmatrix}\begin{pmatrix}I_n&0 \\ -B^{-1}X^\top & I_m\end{pmatrix}.\end{split}\]

Exercise 2


Recall that a semidefinite programming (SDP) problem is of form

\[\min\left\{\langle C,\,X\rangle\mid X\in \mathbb S_+^n,\quad \langle A_i,\,X\rangle = b_i,\quad i=1,\dots, m\right\},\]

where \(n,m\in \mathbb N\), \(C,A_1,\dots, A_m\in \mathbb S^n\) are symmetric matrices, \(X\in \mathbb S_+^n\) is symmetric positive semidefinite, and \(b_i\in \mathbb R\).

A linear programming problem (LP) is a problem of form

\[\min\left\{\langle c,\,x\rangle\mid x\geq 0,\quad \langle a_i,\,x\rangle = b_i,\quad i=1,\dots,m\right\}\]

with \(c,x,a_1,\dots,a_m\in \mathbb R^n\) and \(b_1,\dots,b_m\in \mathbb R\).

We will show that an LP is equivalent to an SDP with \(C=\operatorname{diag} c\) and \(A_i=\operatorname{diag} a_i\). We do this by showing that a solution to the LP problem also solves the associated SDP (and vice versa).

Exercise 2a)


  1. Show that if \(x\in \mathbb R^n\) is feasible for the LP, then \(X = \operatorname{diag} x \in \mathbb R^{n\times n}\) is feasible for the SDP.

  2. Show that if \(X\in \mathbb R^{n\times n}\) is feasible for the SDP, then \(x = \operatorname{diag} X \in \mathbb R^{n}\) is feasible for the LP.

  1. Let \(x\in \mathbb R^n\) be feasible for the LP. Then \(x\geq 0\) implies that \(X = \operatorname{diag} x\) is positive semidefinite. Furthermore note that since

    \[\langle \operatorname{diag} a_i, \operatorname{diag} x \rangle = \langle a_i, x \rangle = b_i\]

we have that \(X\) is feasible for the SDP.

  1. Let \(X\in \mathbb R^{n\times n}\) be feasible for the SDP. Then since \(X_{ii} = e_i^\top X e_i\geq 0\), we have that \(\operatorname{diag} X \geq 0\). By the same reason as above , we have that \(\langle a_i,\operatorname{diag} X \rangle = b_i\), and \(x\) is feasible for the LP.

Exercise 2b)


  1. Show that if \(x\in \mathbb R^n\) solves the LP, then \(X=\operatorname{diag} x\) solves the SDP.

  2. Show that if \(X\in \mathbb R^{n\times n}\) solves the SDP, then \(x=\operatorname{diag} X\) solves the LP.

  1. Consider \(x\) solving the LP and let \(X = \operatorname{diag} x\). Then if \(X'\) is any feasible point for the SDP, we have that \(\operatorname{diag} X'\) is feasible for the LP. Note that therefore

    \[\langle \operatorname{diag} x,\operatorname{diag} c \rangle = \langle x,c \rangle \leq \langle \operatorname{diag} X' , c\rangle = \langle X' , \operatorname{diag} c\rangle\]

where the inequality in the middle is because of optimality of \(x\). We conclude that \(X\) solves the SDP.

  1. Consider \(X\) solving the SDP and let \(x=\operatorname{diag} X\). Then if \(x'\) is any feasible point for the LP, we have that \(\operatorname{diag} x'\) is feasible for the SDP. Then note that

    \[\langle x, c\rangle = \langle X, \operatorname{diag} c\rangle \leq \langle \operatorname{diag} x', \operatorname{diag} c\rangle = \langle x', c\rangle\]

The inequality in the middle is again by optimality of \(X\). This implies that \(x\) solves the LP.

Exercise 3


Recall that a second-order cone program (SOCP) uses a constraint on the variable \(x\in \mathbb R^n\) of form

\[\| Dx+b\|_2 \leq c^\top x +d\]

with \(D\in \mathbb R^{l\times n}\), \(b \in \mathbb R^l\), \(c\in \mathbb R^n\) and \(d\in \mathbb R\).

We will show that an SOCP constraint is a special case of an SDP constraint.

Exercise 3a)


Show that this constraint is equivalent to the constraint \(X(x)\succeq 0\) with

\[\begin{split}X(x) = \begin{pmatrix}c^\top x+ d & (Dx+b)^\top \\ Dx+b & (c^\top x +d)I_l \end{pmatrix}\end{split}\]

Hint: compute the Schur complement of this block matrix, and use the result of 1d). Consider the case where \(c^\top x +d =0\) and \(c^\top x+ d >0\) seperately.

First of all \(X(x)\succeq\) only if \(c^\top x+d\geq 0\), which is also implied by the SOCP constraint.

Now suppose first of all that \(c^\top x+d = 0\), then in particular the diagonal of \(X(x)\) is zero, so \(X(x)\succeq 0\) if and only if \(X(x)=0\). This is because any princpial submatrix of a PSD matrix is PSD, this means in particular that the submatrix given by the first and \((i+1)\)st column/roms is PSD, i.e.

\[\begin{split}\begin{pmatrix}0 &(Dx+b)_i \\ (Dx+b)_i & 0\end{pmatrix} \succeq 0\end{split}\]

which happens if and only if \(Dx+b=0\), which is the SOCP constraint in this case.

Now assume \(c^\top x+d> 0\). Then \(X(x)\succeq 0\) if and only if \(S\succeq 0\) with \(S\) the Schur complement. The Schur complement is given by

\[S = c^\top x+d - (Dx+b)^\top \left((c^\top x+d)I\right)^{-1}(Dx+b) = c^\top x+d - \frac{\|Dx+b\|^2}{c^\top x + b}\]

Since \(S\) is just a number, we have that \(S\succeq 0\) if and only if

\[(c^\top x +d)^2 \geq \|Dx+b\|^2\]

which is equivalent to the SOCP constraint. Hence the SOCP constraint and the constraint \(X(x)\succeq 0\) are equivalent.

Exercise 3b)


We will see later in the course that an SDP constraint is equivalent to a Linear matrix inequality (LMI) constraint. An LMI constraint is a constraint on \(x\in \mathbb R^n\) is a constraint of form

\[A_0 + \sum_i x_i A_i \succeq 0\]

where \(A_i\in \mathbb S^n\) are all symmetric matrices.

Show that the constraint \(X(x)\succeq 0\) is an LMI.

Write \(D_i\) the \(i\)th column of \(D\). Then note that

\[\begin{split}X(x) = \begin{pmatrix} d+\sum_i c_i x_i & b^\top + \sum_i D_i^\top x_i \\ b + \sum_i D_i x_i & \left(d+\sum_i c_i x_i\right)I \end{pmatrix}\end{split}\]

Hence \(X(x)\succeq 0\) is equivalent to the LMI with

\[\begin{split}A_0 = \begin{pmatrix} d & b^\top \\ b & d I_l \end{pmatrix},\qquad A_i = \begin{pmatrix} c_i & D_i^\top \\ D_i & c_i I_l\end{pmatrix}\end{split}\]

Exercise 4


Prove the following problems are convex optimization problems. In all these problems \(x\in\mathbb R\) is a scalar variable.

Exercise 4a)


Let \(p,q\) be posynomials. Prove the following problem is convex.

\[\min_{x> 0}\max\{p(x),q(x)\}\]

The problem is equivalent to the following GP:

\[\begin{split}\begin{array}{ll} \text{Minimize} & t\\ \text{Subject to} & p(x) / t\leq 1 \\ & q(x) / t \leq 1 \end{array}\end{split}\]

Any GP is in particular convex.

Exercise 4b)


Let \(p,q\) be posynomials, and let \(r(x) = x^\alpha\) for some \(\alpha\in \mathbb R\). Prove the following problem is convex

\[\begin{split}\begin{array}{ll} \text{Minimize} & \dfrac{p(x)}{r(x)-q(x)} \\ \text{Subject to} & r(x) > q(x) \end{array}\end{split}\]

The problem is equivalent to

\[\begin{split}\begin{array}{ll} \text{Minimize} & t\\ \text{Subject to} & p(x) \leq t(r(x)-q(x)) \end{array}\end{split}\]

Since the left hand side of the inequality is strictly positive, this in particular implies \(r(x)>q(x)\). This problem is then equivalent to the GP

\[\begin{split}\begin{array}{ll} \text{Minimize} & t\\ \text{Subject to} & \dfrac{p(x)/t + q(x)}{r(x)} \leq 1 \end{array}\end{split}\]

since any GP is in particular convex, we conclude that the problem is convex.

Exercise 4c)


Let \(p,q\) be posynomials. Prove the following problem is convex

\[\min_{x> 0} \exp(p(x)) + \exp(q(x))\]

Hint: Consider the equivalent problem

\[\begin{split}\begin{array}{ll} \text{Minimize} & \exp(t_1)+\exp(t_2)\\ \text{Subject to} & p(x) \leq t_1 \\ & q(x) \leq t_2 \end{array}\end{split}\]

Then use the transformation \(x=e^y\), and make use of the fact that \(p\) and \(q\) are posynomials to write the constraints as an inequalities involving log-sum-exp.

The problem is equivalent to the following:

\[\begin{split}\begin{array}{ll} \text{Minimize} & \exp(t_1)+\exp(t_2)\\ \text{Subject to} & p(x) \leq t_1 \\ & q(x) \leq t_2 \end{array}\end{split}\]

We do the transformation \(x = \exp(y)\). Let \(p(x) = \sum_i a_i x^{\alpha_i}\) and \(q(x) = \sum_i b_i x^{\beta_i}\) then \(p(x) = \sum_i \exp(\alpha_i y+\log(a_i))\), and \(q(x) = \sum_i\exp(\beta_i y+\log(b_i))\). We then get equivalent the problem

\[\begin{split}\begin{array}{ll} \text{Minimize} & \exp(t_1)+\exp(t_2) \\ \text{Subject to} & \sum_i \exp(\alpha_i y+\log(a_i)) \leq t_1 \\ & \sum_i \exp(\beta_i y+\log(b_i)) \leq t_2 \end{array}\end{split}\]

Since the left hand sides are positive, we can we take \(\log\) on both sides of the inequalities to get the problem below. We denote by \(\operatorname{lse}\) the log-sum-exp function.

\[\begin{split}\begin{array}{ll} \text{Minimize} & \exp(t_1)+\exp(t_2) \\ \text{Subject to} & \operatorname{lse}(\alpha y + \log(a)) - \log(t_1) \leq 0 \\ & \operatorname{lse}(\beta y + \log(b)) - \log(t_2) \leq 0 \end{array}\end{split}\]

where \(\alpha,\beta,a,b\) are the vectors given by the respective values \(\alpha_i,\beta_i,a_i,b_i\). Since \(\exp\), \(\operatorname{lse}\) and \(-\log\) are convex, this is a convex problem.