# Week 4

## Exercise 1

Exercise

Let $$g:\mathbb{R}\rightarrow \mathbb{R}$$ be a convex, monotonically non-decreasing function and let $$f:C\subset\mathbb{R}^n\rightarrow \mathbb{R}$$ be a convex function defined on the convex set $$C\subset \mathbb R^n$$.

### Exercise 1a)

Exercise

Show that the composition $$h(x)=g(f(x))$$ is a convex function.

Recall that $$g$$ non-decreasing means that if $$x\leq y$$ then $$g(x)\leq g(y)$$.

Let $$x,y\in C$$ and consider $$\lambda\in[0,1]$$. By convexity of $$f$$:

$f(\lambda x+(1 -\lambda )y)\leq \lambda f(x)+(1-\lambda)f(y).$

Since $$g$$ is non-decreasing:

\begin{split}\begin{align} g\left(f(\lambda x+(1 -\lambda )y)\right) &\leq g\left(\lambda f(x)+(1-\lambda)f(y)\right)\\ &\leq \lambda g(f(x))+(1-\lambda)g(f(y)), \end{align}\end{split}

where in the last step we used the convexity of $$g$$. This tells us precisely that $$h=g\circ f$$ is convex.

### Exercise 1b)

Exercise

Use the previous result to show that the function

$h\colon \mathbb R^n\to \mathbb R,\qquad x\mapsto \exp\left(\beta x^TQx\right),$

where $$\beta>0$$ and $$Q$$ is a symmetric positive definite matrix, is a convex function.

Since $$Q$$ is symmetric positive definite, so is $$\beta Q$$, and hence the map $$x\mapsto \beta x^T Q x$$ is convex. Note that $$\exp$$ is strictly increasing, and hence in particular non-decreasing. Therefore by a) we have that $$h$$ is convex.
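As a numerical sanity check (not a proof), one can sample the convexity inequality for $$h$$ at random points; the matrix `Q` below is a randomly generated symmetric positive definite matrix, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 3, 0.5
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)  # symmetric positive definite by construction

def h(x):
    return np.exp(beta * x @ Q @ x)

# Sample the convexity inequality h(l*x + (1-l)*y) <= l*h(x) + (1-l)*h(y).
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    lam = rng.uniform()
    ok &= h(lam * x + (1 - lam) * y) <= (lam * h(x) + (1 - lam) * h(y)) * (1 + 1e-9)
print(ok)
```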

### Exercise 1c)

Exercise

If we only ask $$g$$ to be convex, the result of a) may no longer hold. Find an example of two convex functions $$g, f$$ such that their composition is not convex.

Take any function $$f$$ which is convex but not concave, e.g. $$f(x) = x^2$$, and pick $$g(x) = -x$$, which is linear and hence convex, but not non-decreasing. Then $$h=g\circ f$$, i.e. $$h(x)=-x^2$$, is concave but not convex.
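The failure is easy to see numerically; with the illustrative definitions below, midpoint convexity is violated at $$x=-1$$, $$y=1$$:

```python
# f(x) = x^2 is convex; g(x) = -x is convex (linear) but decreasing.
def f(x): return x ** 2
def g(x): return -x
def h(x): return g(f(x))  # h(x) = -x^2

x, y, lam = -1.0, 1.0, 0.5
lhs = h(lam * x + (1 - lam) * y)     # h(0) = 0
rhs = lam * h(x) + (1 - lam) * h(y)  # average of h(-1) and h(1), i.e. -1
print(lhs <= rhs)  # False: convexity fails at the midpoint
```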

## Exercise 2

Exercise

Let $$\alpha_1,\alpha_2,\ldots,\alpha_n>0$$ be scalars such that $$\sum_{i=1}^n\alpha_i=1$$.

Show that for all $$(x_1,\dots,x_n)\in \mathbb R^n$$ with $$x_i\geq 0$$ we have the following inequality:

$x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}\leq \alpha_1 x_1+\alpha_2 x_2+\ldots+\alpha_n x_n\,.$

Hint: Use $$\log$$.

If $$x_i=0$$ for some $$i$$, then the left-hand side vanishes while the right-hand side is non-negative, so the inequality holds. Assume therefore that $$x_i> 0$$ for all $$i$$.

Recall that $$\log$$ is concave, therefore:

\begin{split}\begin{align} \log\left(\sum_{i=1}^n\alpha_i x_i\right)&\geq \sum_{i=1}^n\alpha_i \log(x_i)\\ &=\sum_{i=1}^n\log(x_i^{\alpha_i})\\ &=\log\left(\prod_{i=1}^n x_i^{\alpha_i}\right) \end{align}\end{split}

Since $$\exp$$ is monotonically increasing, we can apply it on both sides of the inequality to obtain the desired result.
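A quick numerical spot check of the weighted AM-GM inequality (illustrative only; `dirichlet` draws positive weights summing to one):

```python
import numpy as np

rng = np.random.default_rng(2)
ok = True
for _ in range(1000):
    n = rng.integers(2, 6)
    alpha = rng.dirichlet(np.ones(n))  # alpha_i > 0, sum(alpha) == 1
    x = rng.uniform(0.0, 10.0, size=n)
    geometric = np.prod(x ** alpha)
    arithmetic = alpha @ x
    ok &= geometric <= arithmetic + 1e-9
print(ok)
```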

## Exercise 3

Exercise

Let $$C\subset\mathbb{R}^n$$ be closed and convex, and let $$\left\|\cdot\right\|$$ be the Euclidean norm.

In Homework 1.3a) you proved that for every $$x\in \mathbb{R}^n$$ there exists a unique point $$z\in C$$ that minimizes the distance $$\left\|x-z\right\|$$. Let us define

$\mathcal P_C\colon \mathbb R^n\to C,\qquad\mathcal P_C(x)=\operatorname{argmin}_{z\in C}\left\|z-x\right\|.$

That is, $$\mathcal P_C$$ is the projection onto $$C$$, giving the closest approximation of $$x$$ by any element in $$C$$.

### Exercise 3a)

Exercise

Show that for $$x\in\mathbb{R}^n$$ a point $$z\in C$$ satisfies $$z = \mathcal P_C(x)$$ if and only if

$(y-z)^T(x-z)\leq 0\,,\quad \forall y\in C\,;$

Notice that the function $$f(z)=\frac12\left\|z-x\right\|^2$$, $$z\in C$$ is convex. Then, from the sufficient and necessary optimality conditions for convex optimization, a point $$z\in C$$ is a minimizer of $$f$$ if and only if

$\nabla f(z)^T(y-z)\geq 0\,,\quad \forall y\in C\,.$

Since $$\nabla f(z)=z-x$$, this condition reads $$(y-z)^T(z-x)\geq 0$$ for all $$y\in C$$, i.e. $$(y-z)^T(x-z)\leq 0$$. As $$f$$ clearly has the same minimizer as $$z\mapsto \|z-x\|$$, we obtain the result.
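For a concrete set the variational inequality can be tested numerically. The sketch below assumes, for illustration, that $$C$$ is the closed unit ball, whose projection simply rescales points lying outside the ball; it then samples random $$y\in C$$:

```python
import numpy as np

rng = np.random.default_rng(3)

def proj_ball(x, r=1.0):
    """Euclidean projection onto the closed ball {z : ||z|| <= r}."""
    nx = np.linalg.norm(x)
    return x if nx <= r else (r / nx) * x

x = 3.0 * rng.standard_normal(3)  # a point typically outside the ball
z = proj_ball(x)

# Check (y - z)^T (x - z) <= 0 on randomly sampled y in C.
ok = True
for _ in range(1000):
    y = proj_ball(rng.standard_normal(3))
    ok &= (y - z) @ (x - z) <= 1e-12
print(ok)
```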

### Exercise 3b)

Exercise

Show that $$\mathcal P_C$$ is continuous and non-expansive. That is, show that the following inequality holds:

$\left\|\mathcal P_C(x)-\mathcal P_C(y)\right\|\leq \left\|x-y\right\|\,,\quad \forall x,y\in\mathbb{R}^n.$

By the Cauchy-Schwarz inequality it is enough to show that

$\|\mathcal P_C(x)-\mathcal P_C(y)\|^2 \leq \left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y \right\rangle$

This is because the Cauchy-Schwarz inequality tells us that

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y \right\rangle\leq \left\|\mathcal P_C(x)-\mathcal P_C(y)\right\|\left\|x-y\right\|.$

Using the optimality condition of 3a) at the point $$x$$, with $$y=\mathcal P_C(y)$$ and $$z=\mathcal P_C(x)$$, we obtain

$\left\langle\mathcal P_C(y)-\mathcal P_C(x),\,x-\mathcal P_C(x)\right\rangle\leq 0.$

Reversing the roles of $$x$$ and $$y$$ we also get

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,y-\mathcal P_C(y) \right\rangle\leq 0.$

Adding these two inequalities we get

$\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,-(x-\mathcal P_C(x))+y-\mathcal P_C(y)\right\rangle\leq 0.$

We can rearrange the terms of this inequality to obtain:

$\|\mathcal P_C(x)-\mathcal P_C(y)\|^2-\left\langle\mathcal P_C(x)-\mathcal P_C(y),\,x-y\right\rangle\leq 0.$

By the remark at the start of the proof this gives the desired non-expansiveness. In particular $$\mathcal P_C$$ is $$1$$-Lipschitz, and hence continuous.
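Non-expansiveness can likewise be spot-checked numerically; here $$C$$ is taken, for illustration, to be a box, for which the projection is a coordinatewise clip:

```python
import numpy as np

rng = np.random.default_rng(4)

def proj_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n; it acts coordinatewise."""
    return np.clip(x, lo, hi)

ok = True
for _ in range(1000):
    x = 3.0 * rng.standard_normal(5)
    y = 3.0 * rng.standard_normal(5)
    ok &= (np.linalg.norm(proj_box(x) - proj_box(y))
           <= np.linalg.norm(x - y) + 1e-12)
print(ok)
```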

## Exercise 4

Exercise

Consider the function

$f\colon S_+^n\to \mathbb R,\qquad f(X) = \log \det (X),$

where $$S_+^n$$ denotes the set of $$n\times n$$ symmetric positive definite matrices. We want to show that $$f$$ is concave by showing that for each $$X,Y\in S_+^n$$ the following function is concave:

$g\colon \{t\in\mathbb R : X+tY\in S_+^n\}\to \mathbb R,\qquad g(t) = \log \det (X+tY).$

### Exercise 4a)

Exercise

If $$X$$ has eigenvalues $$\lambda_1,\dots,\lambda_n >0$$, then show that

$\log\det X = \sum_i\log \lambda_i$

Let $$X = QDQ^{-1}$$ be the eigenvalue decomposition of $$X$$. We then have that

\begin{split}\begin{align} \log\det X &= \log \det QDQ^{-1} = \log \det(D)\\ &=\log\prod_i\lambda_i=\sum_i\log\lambda_i \end{align}\end{split}
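The identity is easy to confirm numerically for a random symmetric positive definite matrix (`slogdet` returns the sign and the log-determinant in a numerically stable way):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)  # symmetric positive definite by construction

sign, logdet = np.linalg.slogdet(X)
eigvals = np.linalg.eigvalsh(X)  # eigenvalues of the symmetric matrix X
print(np.isclose(logdet, np.log(eigvals).sum()))
```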

### Exercise 4b)

Exercise

Show that

$\log \det (X+tY) = \log\det(X)+\log\det(I+tZ),$

for some $$Z\in S_+^n$$.

Hint: Conjugate $$(X+tY)$$ by $$I = X^{1/2}X^{-1/2}$$, where $$X=X^{1/2}X^{1/2}$$.

We have

\begin{split}\begin{align} \log \det (X+tY) &= \log\det(X+(X^{1/2}X^{-1/2})tY(X^{-1/2}X^{1/2}))\\ &= \log\det (X^{1/2}(I +t X^{-1/2}YX^{-1/2})X^{1/2})\\ &= \log\left(\det (X^{1/2})\det(I +t X^{-1/2}YX^{-1/2})\det(X^{1/2})\right)\\ &= \log \det(X) + \log\det(I +t X^{-1/2}YX^{-1/2}) \end{align}\end{split}

Note that if $$X=QDQ^T$$ then $$X^{-1/2} = QD^{-1/2}Q^T$$, so $$X^{-1/2}$$ is symmetric. We thus have that for any nonzero $$x\in \mathbb R^n$$:

$x^T(X^{-1/2}YX^{-1/2})x = (X^{-1/2}x)^TY(X^{-1/2}x) > 0$

Moreover, since $$X^{-1/2}$$ is symmetric, $$Z^T = Z$$, and we conclude that $$Z= X^{-1/2}YX^{-1/2}\in S_+^n$$ is symmetric positive definite.

Note that in general if $$X$$ is symmetric and invertible and $$A\in S_+^n$$, then $$XAX\in S_+^n$$. If $$A,B\in S_+^n$$ then $$AB\in S_+^n$$ if and only if $$AB=BA$$, so we couldn’t just take $$Z=YX^{-1}$$!
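The decomposition, and the fact that $$Z\in S_+^n$$, can be verified numerically; $$X^{-1/2}$$ is computed from the eigendecomposition of $$X$$, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def random_spd(n, rng):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)  # symmetric positive definite

n, t = 4, 0.7
X, Y = random_spd(n, rng), random_spd(n, rng)

w, Q = np.linalg.eigh(X)                   # X = Q diag(w) Q^T
X_inv_half = Q @ np.diag(w ** -0.5) @ Q.T  # X^{-1/2}, symmetric

Z = X_inv_half @ Y @ X_inv_half
lhs = np.linalg.slogdet(X + t * Y)[1]
rhs = np.linalg.slogdet(X)[1] + np.linalg.slogdet(np.eye(n) + t * Z)[1]
print(np.isclose(lhs, rhs))               # the identity from 4b)
print(np.all(np.linalg.eigvalsh(Z) > 0))  # Z is positive definite
```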

### Exercise 4c)

Exercise

Let $$\mu_1,\dots,\mu_n$$ be the eigenvalues of $$Z$$. Show that the eigenvalues of $$I+tZ$$ are given by $$1+t\mu_1,\dots,1+t\mu_n$$.

If $$Z = QDQ^{-1}$$ is the eigenvalue decomposition of $$Z$$, then

$I+tZ = Q(I+tD)Q^{-1}$

Hence the eigenvalues of $$I+tZ$$ are $$1+t\mu_1,\dots,1+t\mu_n$$.

The crucial property here is that $$I$$ and $$Z$$ commute. In general, if $$A,B$$ are symmetric and $$AB=BA$$, then we can simultaneously diagonalize them: $$A = UD_1U^T, \, B=UD_2 U^T$$ with $$U$$ orthogonal and $$D_i$$ diagonal, so that $$A+B = U(D_1+D_2)U^T$$. This means that if two symmetric matrices commute, then the eigenvalues of $$A+B$$ are sums of suitably paired eigenvalues of $$A$$ and $$B$$.
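A numerical illustration of the eigenvalue shift (the sorted order is preserved since $$t>0$$):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
Z = A @ A.T + 4 * np.eye(4)  # symmetric positive definite
t = 0.3

mu = np.linalg.eigvalsh(Z)                       # eigenvalues of Z, ascending
shifted = np.linalg.eigvalsh(np.eye(4) + t * Z)  # eigenvalues of I + tZ
print(np.allclose(shifted, 1 + t * mu))
```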

### Exercise 4d)

Exercise

Show that $$g''(t) < 0$$ for all $$t$$ in the domain of $$g$$ and conclude that $$f=\log\det$$ is concave.

We have that

$g(t) = \sum_i\log(1+t\mu_i)+\log\det(X)$

Hence:

$g'(t) = \sum_i \frac{\mu_i}{1+t\mu_i},$

and consequently,

$g''(t) =\sum_i \frac{-\mu_i^2}{(1+t\mu_i)^2} < 0$

Hence $$g$$ is concave; the inequality is strict because every $$\mu_i>0$$, as $$Z\in S_+^n$$. Since a function is concave precisely when its restriction to every line segment in its domain is concave, and this holds for all $$X,Y\in S_+^n$$, we conclude that $$f=\log\det$$ is concave.
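Finally, midpoint concavity of $$g$$ can be spot-checked numerically for random $$X,Y\in S_+^n$$ (sampling $$t\geq 0$$ so that $$X+tY$$ stays positive definite; a check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(8)

def random_spd(n, rng):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)  # symmetric positive definite

X, Y = random_spd(4, rng), random_spd(4, rng)

def g(t):
    return np.linalg.slogdet(X + t * Y)[1]  # log det(X + tY)

# Check g((s+t)/2) >= (g(s) + g(t)) / 2 on random pairs.
ok = True
for _ in range(1000):
    s, t = rng.uniform(0.0, 5.0, size=2)
    ok &= g((s + t) / 2) >= (g(s) + g(t)) / 2 - 1e-9
print(ok)
```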