
Week 2

Exercise 1

Exercise

Let \(S\subset \mathbb{R}^n\) be a closed convex set and let \(f\) be the affine function \(f(x)=Ax+b\), with \(A\in\mathbb{R}^{n\times n}\) and \(b\in\mathbb{R}^n\).

(Actually, we do not need \(S\) to be closed for any of this.)

Exercise 1a)

Exercise

Prove that \(f(S)=\left\{f(x)\,|\, x\in S\right\}\) is convex.

Let \(z=f(x)\in f(S)\) and \(w=f(y)\in f(S)\). Then for \(\lambda \in [0,1]\) we have

\[\lambda z+(1-\lambda) w=\lambda f(x)+(1-\lambda) f(y)=\lambda(Ax+b)+(1-\lambda)(Ay+b)=A\bigl(\lambda x+(1-\lambda) y\bigr)+b=f(\lambda x+(1-\lambda) y)\]

By convexity of \(S\) we have \(\lambda x+(1-\lambda) y\in S\), so \(\lambda z+(1-\lambda) w=f(\lambda x+(1-\lambda) y)\in f(S)\). Hence \(f(S)\) is convex.
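As a quick numerical sanity check of the identity above, here is a minimal NumPy sketch; the dimension, the matrix \(A\), the vector \(b\), and the sample points are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

def f(x):
    # the affine map f(x) = Ax + b from the exercise
    return A @ x + b

x, y = rng.standard_normal(n), rng.standard_normal(n)
lam = 0.3

# affine maps preserve convex combinations
lhs = lam * f(x) + (1 - lam) * f(y)
rhs = f(lam * x + (1 - lam) * y)
print(np.allclose(lhs, rhs))  # True
```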

Exercise 1b)

Exercise

Prove that \(f^{-1}(S)=\left\{x\in\mathbb{R}^n\,|\,f(x)\in S\right\}\) is convex.

Let \(x,y\in f^{-1}(S)\), i.e. \(f(x),f(y)\in S\), and let \(\lambda\in[0,1]\). Then

\[f(\lambda x+(1-\lambda)y) = \lambda f(x) +(1-\lambda)f(y),\]

which lies in \(S\) by convexity of \(S\). Hence \(\lambda x+(1-\lambda)y\in f^{-1}(S)\), so \(f^{-1}(S)\) is convex.

Exercise 1c)

Exercise

Suppose \(T\subset \mathbb{R}^n\) is convex. Prove that the Minkowski sum \(S+T=\left\{x+y\,|\,x\in S\,,y\in T\right\}\) is also convex.

Fix the points \(s_1+t_1,s_2+t_2\in S+T\), with \(s_i\in S, t_i\in T\), \(i=1,2\). Choose \(\lambda\in(0,1)\) and consider the point

\[z=\lambda(s_1+t_1)+(1-\lambda)(s_2+t_2)=\lambda s_1+(1-\lambda)s_2+ \lambda t_1+(1-\lambda)t_2 \,.\]

Since \(\lambda s_1+(1-\lambda)s_2\in S\) and \(\lambda t_1+ (1-\lambda)t_2\in T\) by convexity of \(S\) and \(T\) respectively, \(z\in S+T\).
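The decomposition above can also be checked numerically. The sketch below takes \(S\) to be the closed unit disk and \(T\) the unit square (hypothetical choices used only for this demo) and verifies that a convex combination of two points of \(S+T\) splits into a point of \(S\) plus a point of \(T\).

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_disk():
    # a point in the closed unit disk (our example S)
    while True:
        p = rng.uniform(-1, 1, size=2)
        if np.linalg.norm(p) <= 1:
            return p

def sample_square():
    # a point in the unit square [0, 1]^2 (our example T)
    return rng.uniform(0, 1, size=2)

s1, s2 = sample_disk(), sample_disk()
t1, t2 = sample_square(), sample_square()
lam = 0.7

z = lam * (s1 + t1) + (1 - lam) * (s2 + t2)
s = lam * s1 + (1 - lam) * s2   # in S by convexity of the disk
t = lam * t1 + (1 - lam) * t2   # in T by convexity of the square
print(np.allclose(z, s + t),
      np.linalg.norm(s) <= 1,
      np.all((0 <= t) & (t <= 1)))  # all True
```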

Exercise 2

Exercise

Let \(\Omega\) be a convex set. Show that \(f\colon \Omega\to \mathbb R\) is convex if and only if \(\operatorname{epi}(f)\) is convex.

Recall \(\operatorname{epi}(f) = \{(x,t)\in \Omega\times \mathbb R\mid f(x)\leq t\}\)

"\(\Leftarrow\)": Let \(x,y\in \Omega\); then in particular \((x,f(x)),\,(y,f(y))\in \operatorname{epi}(f)\). Let \(t\in (0,1)\). By convexity of the epigraph, the convex combination \(t\,(x,f(x))+(1-t)\,(y,f(y))\) also lies in the epigraph, that is,

\[(tx+(1-t)y,\,tf(x)+(1-t)f(y))\in \operatorname{epi}(f)\]

This means that

\[f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)\]

which is precisely the definition of convexity of \(f\).

"\(\Rightarrow\)": Suppose that \(f\) is convex, let \((x,a),\,(y,b)\in \operatorname{epi}(f)\), and let \(t\in (0,1)\). Then \(f(x)\leq a\) and \(f(y)\leq b\). Consider the convex combination

\[z_t=(tx+(1-t)y, ta+(1-t)b)\]

By convexity,

\[f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)\leq ta+(1-t)b\]

Comparing the first and last expressions in this chain shows that \(z_t\in \operatorname{epi}(f)\), and hence \(\operatorname{epi}(f)\) is convex.
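A small numerical illustration of the "\(\Rightarrow\)" direction, assuming the example convex function \(f(x)=x^2\) (any convex function would do): convex combinations of epigraph points stay in the epigraph.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: x ** 2              # an example convex function

ok = True
for _ in range(1000):
    x, y = rng.uniform(-5, 5, size=2)
    a = f(x) + rng.uniform(0, 3)  # (x, a) in epi(f)
    b = f(y) + rng.uniform(0, 3)  # (y, b) in epi(f)
    t = rng.uniform(0, 1)
    zx, zt = t * x + (1 - t) * y, t * a + (1 - t) * b
    ok &= bool(f(zx) <= zt + 1e-12)  # z_t should again lie in epi(f)
print(ok)  # True
```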

Exercise 3

Exercise

For a compact set \(S \subset \mathbb{R}^{n}\) with \(n\in\mathbb{N}\), show that \(\operatorname{conv}(S)\), the convex hull of \(S\), is a compact set.

Hint: recall that if \(S\subset\mathbb{R}^n\) then every \(x\in \operatorname{conv}(S)\) can be written as a convex combination of \(n+1\) points of \(S\) (Carathéodory’s Theorem).

Short proof: Let

\[\Lambda:=\{\lambda\in [0,1]^{n+1}\mid \sum_{i=0}^n\lambda_i = 1\}\]

This set is compact, since it is a closed subset of the compact set \([0,1]^{n+1}\). By Carathéodory’s theorem the following map is surjective:

\[f\colon S^{n+1}\times \Lambda \to \operatorname{conv}(S),\qquad \bigl((y^0,\dots,y^n),(\alpha^0,\dots,\alpha^n)\bigr)\mapsto \sum_{i=0}^n\alpha^iy^i\]

Since \(\operatorname{conv}(S)\) is the image of a continuous map from a compact space, it is also compact.


Using the sequential-compactness definition: Let \(\{x_k\}_{k\in \mathbb N}\subset \operatorname{conv}(S)\) be a sequence. We will show it has a convergent subsequence with limit in \(\operatorname{conv}(S)\), i.e. we will show that \(\operatorname{conv}(S)\) is sequentially compact, which is equivalent to compactness for metric spaces.

By Carathéodory’s theorem for each \(x_k\) there are \(y_k^i\in S\) and \(\alpha_k^i\in[0,1]\) such that \(\sum_{i=0}^n\alpha_k^iy_k^i=x_k\). This defines a sequence

\[z_k = \bigl((y_k^0,\dots,y_k^n),(\alpha_k^0,\dots,\alpha_k^n)\bigr) \in S^{n+1}\times \Lambda\]

Since \(S^{n+1}\times \Lambda\) is compact, \(z_k\) has a convergent subsequence \(\{z_{k_m}\}\). Since the map

\[f\colon S^{n+1}\times \Lambda\to \operatorname{conv}(S),\qquad \bigl((y^0,\dots,y^n),(\alpha^0,\dots,\alpha^n)\bigr)\mapsto \sum_{i=0}^n\alpha^iy^i\]

is continuous, \(x_{k_m}=f(z_{k_m})\) converges to \(f\bigl(\lim_{m\to\infty} z_{k_m}\bigr)\in \operatorname{conv}(S)\), so \(\{x_{k_m}\}\) is a convergent subsequence of \(\{x_k\}\) with limit in \(\operatorname{conv}(S)\). We conclude that \(\operatorname{conv}(S)\) is indeed compact.
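A minimal numerical illustration of the map \(f\) used in both proofs, for \(n=2\) and a finite (hence compact) sample set \(S\); both choices, and the use of scipy.spatial.ConvexHull to describe \(\operatorname{conv}(S)\) by facet inequalities, are assumptions of this demo only.

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(3)

# a compact (here: finite) set S in R^2, chosen arbitrarily
S = rng.standard_normal((20, 2))
hull = ConvexHull(S)
# facet inequalities of conv(S): normal @ p + offset <= 0 for points inside
normals, offsets = hull.equations[:, :2], hull.equations[:, 2]

inside = True
for _ in range(1000):
    ys = S[rng.choice(len(S), size=3)]   # y^0, y^1, y^2 in S
    alphas = rng.dirichlet(np.ones(3))   # a point of Lambda (n + 1 = 3 weights)
    p = alphas @ ys                      # the map f from the proof
    inside &= bool(np.all(normals @ p + offsets <= 1e-9))
print(inside)  # every image point lies in conv(S)
```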

Exercise 4

Exercise

For \(n\in\mathbb{N}\), \(A\in\mathbb{R}^{n \times n}\), \(b\in\mathbb{R}^n\) and \(c\in\mathbb{R}\), let us consider the function \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) given by

\[f(x) = \frac12 \, x^{\top} \!\! A \, x - b^{\top} \! x + c \quad\text{for all}\quad x\in\mathbb{R}^n\]

and the unconstrained-optimization problem

\[\tag{$*$} \min_{x\in\mathbb{R}^n} f(x) \, .\]

Exercise 4a)

Exercise

Assume that \(A\) is indefinite. Show that the problem has no solution.

Since \(\nabla^2 f = A\) and \(A\) is indefinite, the second-order necessary condition fails everywhere: no critical point of \(f\) is a local minimum, and so the function does not have a minimum.

More directly, since \(A\) is indefinite, there exists \(x\in\mathbb{R}^n\) such that \(\mu=-x^{\top}A x > 0\) and, thus,

\[f(tx) = -\frac{\mu}{2} \, t^2 - t \, (b^{\top} \! x) + c\]

is not bounded from below with respect to \(t\in\mathbb{R}\). As a result, the infimum of \(f\) on \(\mathbb{R}^n\) is not attained and problem (\(*\)) has no solution.
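A short numerical illustration with a hypothetical indefinite matrix: along a direction of negative curvature, \(f(tx)\) decreases without bound.

```python
import numpy as np

# an indefinite A and arbitrary b, c (illustrative choices only)
A = np.diag([1.0, -1.0])
b = np.array([0.5, -0.3])
c = 1.0

def f(x):
    return 0.5 * x @ A @ x - b @ x + c

x = np.array([0.0, 1.0])          # direction with x^T A x = -1 < 0
for t in [1, 10, 100, 1000]:
    print(t, f(t * x))            # values tend to -infinity as t grows
```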

Exercise 4b)

Exercise

Let \(A\) be positive definite. Show that the problem is uniquely solvable. Obtain the solution for given \(A\), \(b\) and \(c\).

Differentiation yields

\[\nabla f (x) = A x - b \quad\text{and}\quad \nabla^2 f(x) = A\]

for all \(x\in\mathbb{R}^n\).

Since \(A\) is positive definite, \(f\) is strongly convex (with parameter \(\lambda_{\min}(A)>0\)), so the first-order condition \(\nabla f(x)=Ax-b=0\) is not only necessary but also sufficient for \(x\) to be the solution of problem \((*)\). Every positive-definite matrix is invertible, so the solution exists, is unique, and equals \(x=A^{-1}b\).
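A minimal NumPy sketch of this solution formula, using a randomly generated positive-definite \(A\) (an assumption of the demo, not data from the exercise): solving the stationarity equation \(Ax=b\) yields the unique minimizer.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.standard_normal((3, 3))
A = G @ G.T + 3 * np.eye(3)       # a positive definite matrix (illustrative)
b = rng.standard_normal(3)
c = 2.0

def f(x):
    return 0.5 * x @ A @ x - b @ x + c

x_star = np.linalg.solve(A, b)         # the unique minimizer x = A^{-1} b
print(np.allclose(A @ x_star - b, 0))  # gradient vanishes at x_star

# random perturbations never improve on x_star
print(all(f(x_star + rng.standard_normal(3)) >= f(x_star) for _ in range(1000)))
```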

Exercise 4c)

Exercise

When \(A\) is positive semi-definite but not positive definite, find an example where the problem \((*)\) admits

  • no solution;

  • several solutions.

First case: The matrix \(A=0\) is positive semi-definite but not positive definite, and for any \(b\neq 0\) problem \((*)\) has no solution, since \(f(x)=-b^\top x+c\) is not bounded from below.

Second case: The matrix

\[A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\]

is positive semi-definite but not positive definite; with \(b=0\) we get \(f(x)=\tfrac12(x_1+x_2)^2+c\), so the solution set of problem \((*)\) is

\[\big\{x\in\mathbb{R}^2: \; x_1+x_2=0\big\} \, .\]
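Both examples are easy to check numerically; the following sketch simply evaluates the two objectives (taking \(c=0\) in the first case, an arbitrary choice).

```python
import numpy as np

# first case: A = 0, b != 0, so f(x) = -b^T x (+ c) is unbounded below
b = np.array([1.0, 0.0])
f1 = lambda x: -b @ x
print([f1(t * b) for t in (1, 10, 100)])   # -1, -10, -100, ...

# second case: A = [[1, 1], [1, 1]], b = 0; every x with x1 + x2 = 0 is optimal
A = np.array([[1.0, 1.0], [1.0, 1.0]])
f2 = lambda x: 0.5 * x @ A @ x
for s in (-2.0, 0.0, 3.0):
    print(f2(np.array([s, -s])))           # always the optimal value 0
```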

Exercise 5

Exercise

For \(m\in\mathbb{N}\) such that \(m \geq n\), a matrix \(M\in\mathbb{R}^{m \times n}\) and a vector \(u\in\mathbb{R}^m\), consider the least-squares problem

\[\tag{$**$} \min_{x\in\mathbb{R}^n} \|Mx-u\|_2 \, .\]

Exercise 5a)

Exercise

Demonstrate that problem \((**)\) is a problem of type \((*)\) defined in Exercise 4.

What are the matrix \(A\), the vector \(b\) and the scalar \(c\) in this case?

The problem \((**)\) is equivalent to minimizing the squared norm (since \(t\mapsto t^2\) is increasing on \([0,\infty)\)):

\[\|Mx-u\|^2_2 = (Mx-u)^\top (Mx-u) = x^\top M^\top Mx-2 u^\top M x+u^\top u\]

That is, problem \((*)\) with \(A=2M^\top M\), \(b=2M^\top u\) and \(c=u^\top u\).
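The identification can be verified numerically; the dimensions, \(M\), \(u\), and the test point below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 6, 3
M = rng.standard_normal((m, n))
u = rng.standard_normal(m)

A = 2 * M.T @ M
b = 2 * M.T @ u
c = u @ u

x = rng.standard_normal(n)
lhs = np.linalg.norm(M @ x - u) ** 2
rhs = 0.5 * x @ A @ x - b @ x + c
print(np.isclose(lhs, rhs))  # True
```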

Exercise 5b)

Exercise

Show that the matrix \(A\) of part a) is always positive semidefinite. Show that problem \((**)\) always admits a solution.

(Remark: this is not a contradiction to 4c) )

We have \(A=2M^\top M\), which is always positive semidefinite, since \(x^\top M^\top M x = \|Mx\|^2\geq 0\) for all \(x\). The problem can be rephrased as \(\min_{y\in \operatorname{im}M}\|y-u\|\). In Exercise 3 of Week 1 we showed that this projection problem has a unique solution provided \(\operatorname{im}M\) is convex and closed, and any linear subspace of \(\mathbb{R}^m\) is convex and closed.

This does not contradict Exercise 4c): there we were free to choose \(b\) arbitrarily, whereas here \(b=2M^\top u\) always lies in \(\operatorname{im}(M^\top)=\operatorname{im}(M^\top M)\), so the stationarity equation \(Ax=b\) is always solvable.
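The existence statement can be illustrated with a rank-deficient \(M\) (a made-up example): np.linalg.lstsq still returns a minimizer of \(\|Mx-u\|_2\), although the minimizer is no longer unique.

```python
import numpy as np

# a rank-deficient M: the second column is twice the first (illustrative example)
M = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [0.0, 0.0]])
u = np.array([1.0, 3.0, 5.0])

x, *_ = np.linalg.lstsq(M, u, rcond=None)   # a minimizer of ||Mx - u||_2
print(np.linalg.norm(M @ x - u))

# the minimizer is not unique: shifting along the null space of M keeps the residual
d = np.array([2.0, -1.0])                   # M @ d = 0
print(np.isclose(np.linalg.norm(M @ (x + d) - u), np.linalg.norm(M @ x - u)))
```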

Exercise 5c)

Exercise

Now suppose that \(M\) has full rank. Show that the solution is then unique, and give an explicit formula for the solution.

If \(M\) has full rank, then \(M^\top M\) is in fact positive definite: for \(x\neq 0\) we have \(Mx\neq 0\) (since \(m\geq n\), full rank means \(M\) is injective), and hence \(x^\top M^\top Mx=\|Mx\|^2>0\). The solution is therefore unique by Exercise 4b), and it is given by \(x=A^{-1}b\), i.e. \(x=(M^\top M)^{-1}M^\top u\).

Note that \(M^\dagger:=(M^\top M)^{-1}M^\top\) is the Moore–Penrose pseudoinverse of \(M\) in the full-rank case. Even if \(M\) does not have full rank, the Moore–Penrose pseudoinverse can still be defined, and in particular \(x=M^\dagger u\) then solves \((**)\).
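A short NumPy check of the closed-form solution against two library routines, with randomly generated \(M\) and \(u\) (illustrative data; a random tall matrix has full column rank with probability one).

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 8, 3
M = rng.standard_normal((m, n))   # full column rank (almost surely)
u = rng.standard_normal(m)

x_formula = np.linalg.solve(M.T @ M, M.T @ u)   # (M^T M)^{-1} M^T u
x_lstsq, *_ = np.linalg.lstsq(M, u, rcond=None)
x_pinv = np.linalg.pinv(M) @ u                  # Moore-Penrose pseudoinverse

print(np.allclose(x_formula, x_lstsq), np.allclose(x_formula, x_pinv))  # True True
```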