# Week 4

## Exercise 1

Exercise

Let $$g:\mathbb{R}\rightarrow \mathbb{R}$$ be a convex, monotonically non-decreasing function and let $$f:C\subset\mathbb{R}^n\rightarrow \mathbb{R}$$ be a convex function defined on the convex set $$C\subset \mathbb R^n$$.

### Exercise 1a)

Exercise

Show that the composition $$h(x)=g(f(x))$$ is a convex function.

Recall that $$g$$ non-decreasing means that if $$x\leq y$$ then $$g(x)\leq g(y)$$.

Let $$x,y\in C$$ and consider $$\lambda\in[0,1]$$. By convexity of $$f$$:

$f(\lambda x+(1 -\lambda )y)\leq \lambda f(x)+(1-\lambda)f(y).$

Since $$g$$ is non-decreasing:

\begin{split}\begin{align} g\left(f(\lambda x+(1 -\lambda )y)\right) &\leq g\left(\lambda f(x)+(1-\lambda)f(y)\right)\\ &\leq \lambda g(f(x))+(1-\lambda)g(f(y)), \end{align}\end{split}

where in the last step we used the convexity of $$g$$. This tells us precisely that $$h=g\circ f$$ is convex.

### Exercise 1b)

Exercise

Use the previous result to show that the function

$h\colon \mathbb R^n\to \mathbb R,\qquad x\mapsto \exp\left(\beta x^TQx\right),$

where $$\beta>0$$ and $$Q$$ is a symmetric positive definite matrix, is a convex function.

Since $$Q$$ is symmetric positive definite, so is $$\beta Q$$, and hence the map $$x\mapsto \beta x^T Q x$$ is convex. Note that $$\exp$$ is strictly increasing, and hence in particular non-decreasing. Therefore by a) we have that $$h$$ is convex.
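As a numerical sanity check (not a proof), one can sample the convexity inequality for $$h$$ at random points; the matrix `Q` below is a randomly generated symmetric positive definite matrix, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 3, 0.5
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)  # symmetric positive definite by construction

def h(x):
    return np.exp(beta * x @ Q @ x)

# Sample the convexity inequality h(l*x + (1-l)*y) <= l*h(x) + (1-l)*h(y).
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    lam = rng.uniform()
    ok &= h(lam * x + (1 - lam) * y) <= (lam * h(x) + (1 - lam) * h(y)) * (1 + 1e-9)
print(ok)
```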

### Exercise 1c)

Exercise

If we only ask $$g$$ to be convex, the result of a) may no longer hold. Find an example of two convex functions $$g, f$$ such that their composition is not convex.

Take any function $$f$$ which is convex but not concave, e.g. $$f(x) = x^2$$, and pick $$g(x) = -x$$, which is linear and hence convex, but not non-decreasing. Then $$h=g\circ f$$, i.e. $$h(x)=-x^2$$, is concave but not convex.
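The failure is easy to see numerically; with the illustrative definitions below, midpoint convexity is violated at $$x=-1$$, $$y=1$$:

```python
# f(x) = x^2 is convex; g(x) = -x is convex (linear) but decreasing.
def f(x): return x ** 2
def g(x): return -x
def h(x): return g(f(x))  # h(x) = -x^2

x, y, lam = -1.0, 1.0, 0.5
lhs = h(lam * x + (1 - lam) * y)     # h(0) = 0
rhs = lam * h(x) + (1 - lam) * h(y)  # average of h(-1) and h(1), i.e. -1
print(lhs <= rhs)  # False: convexity fails at the midpoint
```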

## Exercise 2

Exercise

Let $$\alpha_1,\alpha_2,\ldots,\alpha_n>0$$ be scalars such that $$\sum_{i=1}^n\alpha_i=1$$.

Show that for all $$(x_1,\dots,x_n)\in \mathbb R^n$$ with $$x_i\geq 0$$ we have the following inequality:

$x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}\leq \alpha_1 x_1+\alpha_2 x_2+\ldots+\alpha_n x_n\,.$

Hint: Use $$\log$$.

If $$x_i=0$$ for some $$i$$, then the left-hand side vanishes while the right-hand side is non-negative, so the inequality holds. Assume therefore that $$x_i> 0$$ for all $$i$$.

Recall that $$\log$$ is concave, therefore:

\begin{split}\begin{align} \log\left(\sum_{i=1}^n\alpha_i x_i\right)&\geq \sum_{i=1}^n\alpha_i \log(x_i)\\ &=\sum_{i=1}^n\log(x_i^{\alpha_i})\\ &=\log\left(\prod_{i=1}^n x_i^{\alpha_i}\right) \end{align}\end{split}

Since $$\exp$$ is monotonically increasing, we can apply it on both sides of the inequality to obtain the desired result.
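A quick numerical spot check of the weighted AM-GM inequality (illustrative only; `dirichlet` draws positive weights summing to one):

```python
import numpy as np

rng = np.random.default_rng(2)
ok = True
for _ in range(1000):
    n = rng.integers(2, 6)
    alpha = rng.dirichlet(np.ones(n))  # alpha_i > 0, sum(alpha) == 1
    x = rng.uniform(0.0, 10.0, size=n)
    geometric = np.prod(x ** alpha)
    arithmetic = alpha @ x
    ok &= geometric <= arithmetic + 1e-9
print(ok)
```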

## Exercise 3

Exercise

Let $$C\subset\mathbb{R}^n$$ be closed and convex, and let $$\left\|\cdot\right\|$$ be the Euclidean norm.

In Homework 1.3a) you proved that for every $$x\in \mathbb{R}^n$$ there exists a unique point $$z\in C$$ that minimizes the distance $$\left\|x-z\right\|$$. Let us define

$\mathcal P_C\colon \mathbb R^n\to C,\qquad\mathcal P_C(x)=\operatorname{argmin}_{z\in C}\left\|z-x\right\|.$

That is, $$\mathcal P_C$$ is the projection onto $$C$$, giving the closest approximation of $$x$$ by any element in $$C$$.

### Exercise 3a)

Exercise

Show that for $$x\in\mathbb{R}^n$$ a point $$z\in C$$ satisfies $$z = \mathcal P_C(x)$$ if and only if

$(y-z)^T(x-z)\leq 0\,,\quad \forall y\in C\,;$

Notice that the function $$f(z)=\frac12\left\|z-x\right\|^2$$, $$z\in C$$ is convex. Then, from the sufficient and necessary optimality conditions for convex optimization, a point $$z\in C$$ is a minimizer of $$f$$ if and only if

$\nabla f(z)^T(y-z)\geq 0\,,\quad \forall y\in C\,.$

Since $$\nabla f(z)=z-x$$, this condition reads $$(y-z)^T(z-x)\geq 0$$ for all $$y\in C$$, i.e. $$(y-z)^T(x-z)\leq 0$$. As $$f$$ clearly has the same minimizer as $$z\mapsto \|z-x\|$$, we obtain the result.
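For a concrete set the variational inequality can be tested numerically. The sketch below assumes, for illustration, that $$C$$ is the closed unit ball, whose projection simply rescales points lying outside the ball; it then samples random $$y\in C$$:

```python
import numpy as np

rng = np.random.default_rng(3)

def proj_ball(x, r=1.0):
    """Euclidean projection onto the closed ball {z : ||z|| <= r}."""
    nx = np.linalg.norm(x)
    return x if nx <= r else (r / nx) * x

x = 3.0 * rng.standard_normal(3)  # a point typically outside the ball
z = proj_ball(x)

# Check (y - z)^T (x - z) <= 0 on randomly sampled y in C.
ok = True
for _ in range(1000):
    y = proj_ball(rng.standard_normal(3))
    ok &= (y - z) @ (x - z) <= 1e-12
print(ok)
```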

### Exercise 3b)

Exercise

Show that $$\mathcal P_C$$ is continuous and non-expansive. That is, show that the following inequality holds:

$\left\|\mathcal P_C(x)-\mathcal P_C(y)\right\|\leq \left\|x-y\right\|\,,\quad \forall x,y\in\mathbb{R}^n.$

By the Cauchy-Schwarz inequality it is enough to show that

$\|\mathcal P_C(x)-\mathcal P_C(y)\|^2 \leq \left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y \right\rangle$

This is because the Cauchy-Schwarz inequality tells us that

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y \right\rangle\leq \left\|\mathcal P_C(x)-\mathcal P_C(y)\right\|\left\|x-y\right\|.$

Using the optimality condition of 3a) at the point $$x$$, with $$y=\mathcal P_C(y)$$ and $$z=\mathcal P_C(x)$$, we obtain

$\left\langle\mathcal P_C(y)-\mathcal P_C(x),\,x-\mathcal P_C(x)\right\rangle\leq 0.$

Reversing the roles of $$x$$ and $$y$$ we also get

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,y-\mathcal P_C(y) \right\rangle\leq 0.$

Adding these two inequalities we get

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,-(x-\mathcal P_C(x))+y-\mathcal P_C(y)\right\rangle\leq 0.$

We can rearrange the terms of this inequality to obtain:

$\|\mathcal P_C(x)-\mathcal P_C(y)\|^2-\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y\right\rangle\leq 0.$

By the remark at the start of the proof this gives the desired non-expansiveness. In particular $$\mathcal P_C$$ is $$1$$-Lipschitz, and hence continuous.
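Non-expansiveness can likewise be spot-checked numerically; here $$C$$ is taken, for illustration, to be a box, for which the projection is a coordinatewise clip:

```python
import numpy as np

rng = np.random.default_rng(4)

def proj_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n; it acts coordinatewise."""
    return np.clip(x, lo, hi)

ok = True
for _ in range(1000):
    x = 3.0 * rng.standard_normal(5)
    y = 3.0 * rng.standard_normal(5)
    ok &= (np.linalg.norm(proj_box(x) - proj_box(y))
           <= np.linalg.norm(x - y) + 1e-12)
print(ok)
```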

## Exercise 4

Exercise

Consider the function

$f\colon S_+^n\to \mathbb R,\qquad f(X) = \log \det (X),$

where $$S_+^n$$ denotes the set of $$n\times n$$ symmetric positive definite matrices. We want to show that $$f$$ is concave by showing that for each $$X,Y\in S_+^n$$ the following function is concave:

$g\colon \{t\in\mathbb R : X+tY\in S_+^n\}\to \mathbb R,\qquad g(t) = \log \det (X+tY).$

### Exercise 4a)

Exercise

If $$X$$ has eigenvalues $$\lambda_1,\dots,\lambda_n >0$$, then show that

$\log\det X = \sum_i\log \lambda_i$

Let $$X = QDQ^{-1}$$ be the eigenvalue decomposition of $$X$$. We then have that

\begin{split}\begin{align} \log\det X &= \log \det QDQ^{-1} = \log \det(D)\\ &=\log\prod_i\lambda_i=\sum_i\log\lambda_i \end{align}\end{split}
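The identity is easy to confirm numerically for a random symmetric positive definite matrix (`slogdet` returns the sign and the log-determinant in a numerically stable way):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)  # symmetric positive definite by construction

sign, logdet = np.linalg.slogdet(X)
eigvals = np.linalg.eigvalsh(X)  # eigenvalues of the symmetric matrix X
print(np.isclose(logdet, np.log(eigvals).sum()))
```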

### Exercise 4b)

Exercise

Show that

$\log \det (X+tY) = \log\det(X)+\log\det(I+tZ),$

for some $$Z\in S_+^n$$.

Hint: Conjugate $$(X+tY)$$ by $$I = X^{1/2}X^{-1/2}$$, where $$X=X^{1/2}X^{1/2}$$.

We have

\begin{split}\begin{align} \log \det (X+tY) &= \log\det(X+(X^{1/2}X^{-1/2})tY(X^{-1/2}X^{1/2}))\\ &= \log\det (X^{1/2}(I +t X^{-1/2}YX^{-1/2})X^{1/2})\\ &= \log\left(\det (X^{1/2})\det(I +t X^{-1/2}YX^{-1/2})\det(X^{1/2})\right)\\ &= \log \det(X) + \log\det(I +t X^{-1/2}YX^{-1/2}) \end{align}\end{split}

Note that if $$X=QDQ^T$$ then $$X^{-1/2} = QD^{-1/2}Q^T$$, so $$X^{-1/2}$$ is symmetric. We thus have that for any nonzero $$x\in \mathbb R^n$$:

$x^T(X^{-1/2}YX^{-1/2})x = (X^{-1/2}x)^TY(X^{-1/2}x) > 0$

Moreover, since $$X^{-1/2}$$ is symmetric, $$Z^T = Z$$, and we conclude that $$Z= X^{-1/2}YX^{-1/2}\in S_+^n$$ is symmetric positive definite.

Note that in general if $$X$$ is symmetric and invertible and $$A\in S_+^n$$, then $$XAX\in S_+^n$$. If $$A,B\in S_+^n$$ then $$AB\in S_+^n$$ if and only if $$AB=BA$$, so we couldn’t just take $$Z=YX^{-1}$$!
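The decomposition, and the fact that $$Z\in S_+^n$$, can be verified numerically; $$X^{-1/2}$$ is computed from the eigendecomposition of $$X$$, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def random_spd(n, rng):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)  # symmetric positive definite

n, t = 4, 0.7
X, Y = random_spd(n, rng), random_spd(n, rng)

w, Q = np.linalg.eigh(X)                   # X = Q diag(w) Q^T
X_inv_half = Q @ np.diag(w ** -0.5) @ Q.T  # X^{-1/2}, symmetric

Z = X_inv_half @ Y @ X_inv_half
lhs = np.linalg.slogdet(X + t * Y)[1]
rhs = np.linalg.slogdet(X)[1] + np.linalg.slogdet(np.eye(n) + t * Z)[1]
print(np.isclose(lhs, rhs))               # the identity from 4b)
print(np.all(np.linalg.eigvalsh(Z) > 0))  # Z is positive definite
```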

### Exercise 4c)

Exercise

Let $$\mu_1,\dots,\mu_n$$ be the eigenvalues of $$Z$$. Show that the eigenvalues of $$I+tZ$$ are given by $$1+t\mu_1,\dots,1+t\mu_n$$.

If $$Z = QDQ^{-1}$$ is the eigenvalue decomposition of $$Z$$, then

$I+tZ = Q(I+tD)Q^{-1}$

Hence the eigenvalues of $$I+tZ$$ are $$1+t\mu_1,\dots,1+t\mu_n$$.

The crucial property here is that $$I$$ and $$Z$$ commute. In general, if $$A,B$$ are symmetric and $$AB=BA$$, then we can simultaneously diagonalize them: $$A = UD_1U^T, \, B=UD_2 U^T$$ with $$U$$ orthogonal and $$D_i$$ diagonal, so that $$A+B = U(D_1+D_2)U^T$$. This means that if two symmetric matrices commute, then the eigenvalues of $$A+B$$ are sums of suitably paired eigenvalues of $$A$$ and $$B$$.
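A numerical illustration of the eigenvalue shift (the sorted order is preserved since $$t>0$$):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
Z = A @ A.T + 4 * np.eye(4)  # symmetric positive definite
t = 0.3

mu = np.linalg.eigvalsh(Z)                       # eigenvalues of Z, ascending
shifted = np.linalg.eigvalsh(np.eye(4) + t * Z)  # eigenvalues of I + tZ
print(np.allclose(shifted, 1 + t * mu))
```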

### Exercise 4d)

Exercise

Show that $$g''(t) < 0$$ for all $$t$$ in the domain of $$g$$ and conclude that $$f=\log\det$$ is concave.

We have that

$g(t) = \sum_i\log(1+t\mu_i)+\log\det(X)$

Hence:

$g'(t) = \sum_i \frac{\mu_i}{1+t\mu_i},$

and consequently,

$g''(t) =\sum_i \frac{-\mu_i^2}{(1+t\mu_i)^2} < 0$

Hence $$g$$ is concave; the inequality is strict because every $$\mu_i>0$$, as $$Z\in S_+^n$$. Since a function is concave precisely when its restriction to every line segment in its domain is concave, and this holds for all $$X,Y\in S_+^n$$, we conclude that $$f=\log\det$$ is concave.
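Finally, midpoint concavity of $$g$$ can be spot-checked numerically for random $$X,Y\in S_+^n$$ (sampling $$t\geq 0$$ so that $$X+tY$$ stays positive definite; a check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(8)

def random_spd(n, rng):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)  # symmetric positive definite

X, Y = random_spd(4, rng), random_spd(4, rng)

def g(t):
    return np.linalg.slogdet(X + t * Y)[1]  # log det(X + tY)

# Check g((s+t)/2) >= (g(s) + g(t)) / 2 on random pairs.
ok = True
for _ in range(1000):
    s, t = rng.uniform(0.0, 5.0, size=2)
    ok &= g((s + t) / 2) >= (g(s) + g(t)) / 2 - 1e-9
print(ok)
```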