Gradient search algorithms

To apply these methods to a system Ax = b, the matrix A should be

  • Symmetric
  • Positive definite

For such cases, a function f is defined as

\begin{equation}
f = \frac{1}{2} x^{T}Ax - b^{T}x+c
\end{equation}

Then the gradient of f is

\begin{equation}
\nabla f = Ax-b
\end{equation}

Hence, if f is minimised, then \nabla f = 0, which ensures that Ax = b. The converse also holds: the minimiser of f solves Ax = b.

\begin{equation}
\nabla f = Ax-b = -r, \quad \text{where } r \text{ is the residual}
\end{equation}
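
As a quick sanity check of equation (2), here is a minimal NumPy sketch (the test matrix and the helper names f and grad_f are my own illustrative choices) comparing the analytic gradient Ax - b against a finite-difference approximation:

```python
import numpy as np

# An arbitrary symmetric positive definite test system (illustrative values)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

def f(x):
    """Quadratic form f(x) = 1/2 x^T A x - b^T x, taking c = 0."""
    return 0.5 * x @ A @ x - b @ x

def grad_f(x):
    """Analytic gradient: A x - b."""
    return A @ x - b

x = np.array([0.7, -0.3])
eps = 1e-6
# Central finite differences approximate each gradient component
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(2)])
print(np.allclose(fd, grad_f(x), atol=1e-6))  # True: numerics agree with (2)
```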

Steepest descent method

Solve Ax = b where,

A = 
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
,
b = \begin{bmatrix}
   1  \\
   1 
\end{bmatrix}

Solution:

\begin{split}
f  & =\frac{1}{2} x^{T}Ax - b^{T}x+c\\
& =\frac{1}{2}
\begin{bmatrix}
   x_{1}  & x_{2}  
\end{bmatrix}
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
\begin{bmatrix}
   x_{1} \\ x_{2} 
\end{bmatrix}
- 
\begin{bmatrix}
   1 & 1 
\end{bmatrix}
\begin{bmatrix}
   x_{1} \\ x_{2} 
\end{bmatrix}
+c\\

&=
\frac{1}{2}
\begin{bmatrix}
   x_{1}  & x_{2}  
\end{bmatrix}
\begin{bmatrix}
   x_{1} \\ 2x_{2} 
\end{bmatrix}
-
\begin{bmatrix}
   x_{1}  + x_{2}  
\end{bmatrix}
+c\\
&=
\frac{1}{2}x_{1} ^{2}+x_{2} ^{2} - x_{1} - x_{2}+c\\
f=0 \rightarrow &\frac{1}{2}(x_{1}^{2} - 2x_{1}+1)+\left(x_{2}^{2} - 2x_{2}\cdot\frac{1}{2}+\left(\tfrac{1}{2}\right)^{2}\right)+c - \frac{3}{4}=0\\
&\rightarrow  \frac{(x_{1} -1)^{2}}{2}+\left(x_{2}-\frac{1}{2}\right)^{2}=\frac{3}{4}-c \text{.  If the contour } f=0 \text{ passes through the origin, } c=0\\
&\rightarrow  \frac{(x_{1} -1)^{2}}{(\sqrt{\frac{3}{2}})^{2}}+\frac{(x_{2} -\frac{1}{2})^{2}}{(\sqrt{\frac{3}{4}})^{2}} = 1\\
\end{split}

This is an ellipse with centre (1, \frac{1}{2}) that passes through the origin. Select the initial guess as the origin, x^{0} = (0, 0); the solution is the centre of the ellipse, x^{c} = (1, \frac{1}{2}).
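
A two-line NumPy check confirms that the centre is indeed the solution of the system:

```python
import numpy as np
# The centre of the ellipse is exactly A^{-1} b
print(np.linalg.solve(np.array([[1.0, 0.0], [0.0, 2.0]]),
                      np.array([1.0, 1.0])))   # [1.  0.5]
```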

\begin{equation} 
x^{1} = x^{0}+\alpha_{0} r_{0}
\end{equation}

where r_{0} is the residual, which gives the first search direction, and \alpha_{0} is the step length along it.

\begin{equation}
f(x^{1}) = \frac{1}{2} (x^{0}+\alpha_{0} r_{0})^{T}A(x^{0}+\alpha_{0} r_{0}) - b^{T}(x^{0}+\alpha_{0} r_{0})+c
\end{equation}

For f(x^{1}) to be a minimum, its derivative with respect to \alpha_{0} must be zero:

\frac{1}{2} r_{0}^{T}A(x^{0}+\alpha_{0} r_{0})+ \frac{1}{2}(x^{0}+\alpha_{0} r_{0})^{T}Ar_{0} - b^{T}r_{0}=0
\begin{equation}
\alpha_{0}r_{0}^{T}Ar_{0}+\frac{1}{2}r_{0}^{T}Ax^{0}+\frac{1}{2}(x^{0})^{T}Ar_{0}- b^{T}r_{0}=0
\end{equation}

But we have

\begin{split}
(x^{0})^{T}Ar_{0} & =\left((x^{0})^{T}Ar_{0}\right)^{T} \quad \text{(a scalar equals its own transpose)}\\
& =(Ar_{0})^{T}x^{0}\\
& =r_{0}^{T}A^{T}x^{0}\\
& =r_{0}^{T}Ax^{0} \quad \text{(since } A \text{ is symmetric)}\\
(b^{T}r_{0})^{T}& =r_{0}^{T}b
\end{split}

Using the above, (6) becomes

\alpha_{0}r_{0}^{T}Ar_{0}+r_{0}^{T}\overbrace{(Ax^{0}-b)}^{-r_{0}}=0
\begin{equation}
\alpha_{0}=\frac{r_{0}^{T}r_{0}}{r_{0}^{T}Ar_{0}}
\end{equation}
\begin{split}
x^{1} &= x^{0}+\alpha_{0} r_{0}\\
\alpha_{1} &= \frac{r_{1}^{T}r_{1}}{r_{1}^{T}Ar_{1}}\\
x^{2} &= x^{1}+\alpha_{1} r_{1}\\
&\ \ \vdots
\end{split}

The direction r_{1} is perpendicular to r_{0}, so the iterates zig-zag in a step-like fashion towards the final solution x^{c}; a code sketch of the full iteration follows.
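
The whole iteration fits in a few lines. Below is a minimal NumPy sketch of steepest descent using the step length from (7); the function name, tolerance, and iteration cap are illustrative choices, not a reference implementation:

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Minimise f by repeatedly stepping along the residual direction."""
    x = x0.astype(float)
    for k in range(max_iter):
        r = b - A @ x                 # residual = -grad f = search direction
        if np.linalg.norm(r) < tol:   # stop once Ax is (numerically) b
            return x, k
        alpha = (r @ r) / (r @ (A @ r))   # step length from equation (7)
        x = x + alpha * r
    return x, max_iter

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
b = np.array([1.0, 1.0])
x, iters = steepest_descent(A, b, np.zeros(2))
print(x, iters)   # converges to (1, 0.5) along a zig-zag path
```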

Conjugate gradient method

The first search direction is taken to be the residual, p_{0}=r_{0}. Ideally, the next direction p_{1} should point from x^{1} straight at the solution x^{c}, i.e. for some scalar k,

\begin{equation}
p_{1} = k (x^{c} - x^{1})
\end{equation}
\begin{split}
p_{1} & =k (x^{c} - x^{1})\\
A p_{1} & =k (Ax^{c} - Ax^{1})\\
 & =k (b - Ax^{1})\\
 & =k r_{1} \\
r_{0}^{T}A p_{1} & =k r_{0}^{T} r_{1} \\
 & =0 \quad \text{(consecutive residuals are orthogonal)}\\
p_{0}^{T}A p_{1} & =0
\end{split}

That is, p_{0} is A-orthogonal (conjugate) to the new direction p_{1}:

\begin{equation}
p_{0}^{T}A p_{1} =0
\end{equation}

Now we can write p_{1} as a linear combination of r_{0} and r_{1}:

\begin{equation}
p_{1} = r_{1} - \beta _{1} p_{0}
\end{equation}

Substituting (10) into (9),

\begin{equation}
\beta _{1} = \frac{p_{0}^{T}A r_{1} }{p_{0}^{T}A p_{0} }
\end{equation}
\begin{equation}
x^{2} = x^{1}+\alpha_{1} p_{1}
\end{equation}

What is \alpha_{1}? It should be chosen such that f is minimised.

\begin{equation}
f(x^{2}) = \frac{1}{2} (x^{1}+\alpha_{1} p_{1})^{T}A(x^{1}+\alpha_{1} p_{1}) - b^{T}(x^{1}+\alpha_{1} p_{1})+c
\end{equation}

Equating the partial derivative of f with respect to \alpha_{1} to zero,

\frac{1}{2} p_{1}^{T}A(x^{1}+\alpha_{1} p_{1})+ \frac{1}{2}(x^{1}+\alpha_{1} p_{1})^{T}Ap_{1} - b^{T}p_{1}=0
\begin{equation}
\alpha_{1}p_{1}^{T}Ap_{1}+\frac{1}{2}p_{1}^{T}Ax^{1}+\frac{1}{2}(x^{1})^{T}Ap_{1}- b^{T}p_{1}=0
\end{equation}
\alpha_{1}p_{1}^{T}Ap_{1}+p_{1}^{T}\overbrace{(Ax^{1}-b)}^{-r_{1}}=0
\begin{equation}
\alpha_{1}=\frac{p_{1}^{T}r_{1}}{p_{1}^{T}Ap_{1}}
\end{equation}
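
Collecting formulas (10), (11), (12), and (15), and applying them at every step k rather than just step 1, gives the following NumPy sketch of the conjugate gradient iteration. The variable names and convergence test are my own, and this direct form recomputes the residual from scratch rather than using the optimised recurrences found in production CG codes:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10):
    """CG sketch: p0 = r0, then each new direction is made
    A-orthogonal to the previous one via p = r - beta * p."""
    x = x0.astype(float)
    r = b - A @ x
    p = r.copy()                          # p0 = r0
    for _ in range(len(b)):               # exact in at most n steps (in exact arithmetic)
        if np.linalg.norm(r) < tol:
            break
        Ap = A @ p
        alpha = (p @ r) / (p @ Ap)        # equation (15)
        x = x + alpha * p                 # equation (12)
        r = b - A @ x                     # fresh residual
        beta = (p @ (A @ r)) / (p @ Ap)   # equation (11)
        p = r - beta * p                  # equation (10)
    return x

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
b = np.array([1.0, 1.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # [1.  0.5] in two steps
```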

Continuing the example…

Taking x^{0} = (0,0)

\begin{split}
r_{0} & = b - A x^{0} =
\begin{bmatrix}
   1  \\
   1 
\end{bmatrix} - 
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
\begin{bmatrix}
   0  \\
   0 
\end{bmatrix}  
= \begin{bmatrix}
   1  \\
   1 
\end{bmatrix} 
\\
r_{0}^{T}r_{0} & = 
\begin{bmatrix}
   1  &
   1 
\end{bmatrix} 
\begin{bmatrix}
   1  \\
   1 
\end{bmatrix}
=2\\
r_{0}^{T}Ar_{0} & = 
\begin{bmatrix}
   1  &
   1 
\end{bmatrix} 
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
\begin{bmatrix}
   1  \\
   1 
\end{bmatrix}
=3\\
\alpha_{0}& =\frac{r_{0}^{T}r_{0}}{r_{0}^{T}Ar_{0}}=\frac{2}{3}\\
x^{1} & = x^{0}+\alpha_{0} r_{0} = \begin{bmatrix}
   \frac{2}{3}  \\
   \frac{2}{3} 
\end{bmatrix} \\
r_{1} &= b - Ax^{1} =
 \begin{bmatrix}
   1  \\
   1 
\end{bmatrix}
-
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
\begin{bmatrix}
   \frac{2}{3}  \\
   \frac{2}{3} 
\end{bmatrix} =
\begin{bmatrix}
   \frac{1}{3}  \\
   -\frac{1}{3} 
\end{bmatrix} 
\end{split}

Steepest descent would continue in the same way, computing \alpha_{1}, x^{2}, r_{2}, and so on.

Conjugate gradient method:

Note that p_{0} = r_{0}.

\begin{split}
\beta_{1} &= \frac{p_{0}^{T}A r_{1}}{p_{0}^{T}A p_{0}} \\
p_{0}^{T}A r_{1} &= \begin{bmatrix}
   1  &
   1 
\end{bmatrix} 
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
\begin{bmatrix}
   \frac{1}{3}  \\
   -\frac{1}{3} 
\end{bmatrix} 
=
 \begin{bmatrix}
   1  &
   1 
\end{bmatrix} 
\begin{bmatrix}
   \frac{1}{3}  \\
   -\frac{2}{3} 
\end{bmatrix} 
= -\frac{1}{3} 
\\
p_{0}^{T}A p_{0} &= r_{0}^{T}A r_{0} =3\\
\beta_{1} &= -\frac{1}{9}\\
p_{1} &= r_{1} - \beta_{1} p_{0}\\
&= \begin{bmatrix}
   \frac{1}{3}  \\
   -\frac{1}{3} 
\end{bmatrix} 
+
\frac{1}{9}\begin{bmatrix}
   1  \\
   1 
\end{bmatrix}\\ 
&=\begin{bmatrix}
   \frac{4}{9}  \\
   -\frac{2}{9} 
\end{bmatrix} \\
\alpha_{1}&=\frac{p_{1}^{T}r_{1}}{p_{1}^{T}Ap_{1}}\\
&=\frac{\begin{bmatrix}
   \frac{4}{9}  &
   -\frac{2}{9} 
\end{bmatrix}  \begin{bmatrix}
   \frac{1}{3}  \\
   -\frac{1}{3} 
\end{bmatrix} }
{ \begin{bmatrix}
   \frac{4}{9}  &
   -\frac{2}{9} 
\end{bmatrix} 
\begin{bmatrix}
   1 & 0 \\
   0 & 2
\end{bmatrix}
 \begin{bmatrix}
   \frac{4}{9}  \\
   -\frac{2}{9} 
\end{bmatrix} }\\
&=\frac{\frac{2}{9}}
{ \begin{bmatrix}
   \frac{4}{9}  &
   -\frac{2}{9} 
\end{bmatrix} 
 \begin{bmatrix}
   \frac{4}{9}  \\
   -\frac{4}{9} 
\end{bmatrix} }\\
&=\frac{\frac{2}{9}}{\frac{8}{27}}\\
&=\frac{3}{4}\\
x^{2} &= x^{1}+\alpha_{1} p_{1}\\
&=  \begin{bmatrix}
   \frac{2}{3}  \\
   \frac{2}{3} 
\end{bmatrix}
+
\frac{3}{4} 
\begin{bmatrix}
   \frac{4}{9}  \\
   -\frac{2}{9} 
\end{bmatrix} \\
&=  \begin{bmatrix}
   1  \\
   \frac{1}{2} 
\end{bmatrix}
\end{split}

We have reached the exact solution x^{2} = x^{c} = (1, \frac{1}{2}) in just two steps; a short script verifying these numbers follows.
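
As a final check, here is a small self-contained NumPy script reproducing the hand computation above step by step and comparing against a direct solve:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
b = np.array([1.0, 1.0])

# Reproduce the two hand-computed conjugate gradient steps
x0 = np.zeros(2)
r0 = b - A @ x0                           # (1, 1)
p0 = r0
a0 = (p0 @ r0) / (p0 @ A @ p0)            # 2/3
x1 = x0 + a0 * p0                         # (2/3, 2/3)
r1 = b - A @ x1                           # (1/3, -1/3)
b1 = (p0 @ A @ r1) / (p0 @ A @ p0)        # -1/9
p1 = r1 - b1 * p0                         # (4/9, -2/9)
a1 = (p1 @ r1) / (p1 @ A @ p1)            # 3/4
x2 = x1 + a1 * p1                         # (1, 1/2)

print(x2)                                 # [1.  0.5]
print(np.linalg.solve(A, b))              # [1.  0.5] -- matches the direct solve
```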
