To apply these methods to a system Ax = b, the matrix A should be

• Symmetric
• Positive definite

For such cases, a quadratic function f is defined as

$$f(x) = \frac{1}{2} x^{T}Ax - b^{T}x+c$$

$$\nabla f = Ax-b$$

Hence, if f is minimised, then \nabla f = 0, which ensures that Ax = b. The converse also holds: since A is positive definite, the solution of Ax = b minimises f. In terms of the residual r = b - Ax,

$$\nabla f = Ax-b = -r$$
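As a quick numerical illustration (not part of the original derivation), the relation between the gradient and the residual can be checked with a finite-difference approximation; the matrix A and vector b below are the ones used in the worked example that follows.

```python
import numpy as np

# Sketch: verify numerically that grad f = Ax - b = -r for a
# symmetric positive-definite A. A, b match the worked example below.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 1.0])

def f(x, c=0.0):
    return 0.5 * x @ A @ x - b @ x + c

def grad_f(x):
    return A @ x - b

x = np.array([0.3, -0.7])
eps = 1e-6
# Central finite-difference approximation of the gradient.
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(2)])
assert np.allclose(fd, grad_f(x), atol=1e-6)
assert np.allclose(grad_f(x), -(b - A @ x))  # gradient = -residual
```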

## Steepest descent method

Solve Ax = b where,

A =
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
,
b = \begin{bmatrix}
1  \\
1
\end{bmatrix}

Solution:

\begin {split}
f  & =\frac{1}{2} x^{T}Ax - b^{T}x+c\\
& =\frac{1}{2}
\begin{bmatrix}
x_{1}  & x_{2}
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
\begin{bmatrix}
x_{1} \\ x_{2}
\end{bmatrix}
-
\begin{bmatrix}
1 & 1
\end{bmatrix}
\begin{bmatrix}
x_{1} \\ x_{2}
\end{bmatrix}
+c\\
&=
\frac{1}{2}
\begin{bmatrix}
x_{1}  & x_{2}
\end{bmatrix}
\begin{bmatrix}
x_{1} \\ 2x_{2}
\end{bmatrix}
-
\begin{bmatrix}
x_{1}  + x_{2}
\end{bmatrix}
+c\\
&=
\frac{1}{2}x_{1} ^{2}+x_{2} ^{2} - x_{1} - x_{2}+c\\
f=0 \rightarrow &\frac{1}{2}(x_{1} ^{2} - 2x_{1}+1)+\left(x_{2} ^{2} - 2\cdot\frac{1}{2}x_{2}+\left(\frac{1}{2}\right)^{2}\right)+c - \frac{3}{4}=0\\
&\rightarrow  \frac{(x_{1} -1)^{2}}{2}+\left(x_{2}-\frac{1}{2}\right)^{2}=\frac{3}{4}-c \text{.  If the level curve } f=0 \text{ passes through the origin, } c=0\\
&\rightarrow  \frac{(x_{1} -1)^{2}}{(\sqrt{\frac{3}{2}})^{2}}+\frac{(x_{2} -\frac{1}{2})^{2}}{(\sqrt{\frac{3}{4}})^{2}} = 1\\
\end {split}


This level curve is an ellipse with centre (1, \frac{1}{2}) that passes through the origin. Select the origin as the initial guess, x^{0} = (0, 0); the solution is the centre of the ellipse, x^{c} = (1, \frac{1}{2}).

$$x^{1} = x^{0}+\alpha_{0} r_{0}$$

where r_{0} is the residual, which gives the first search direction, and \alpha_{0} is the step length.

$$f(x^{1}) = \frac{1}{2} (x^{0}+\alpha_{0} r_{0})^{T}A(x^{0}+\alpha_{0} r_{0}) - b^{T}(x^{0}+\alpha_{0} r_{0})+c$$

For f(x^{1}) to be a minimum, its derivative with respect to \alpha_{0} must be zero:

$$\frac{1}{2} r_{0}^{T}A(x^{0}+\alpha_{0} r_{0})+ \frac{1}{2}(x^{0}+\alpha_{0} r_{0})^{T}Ar_{0} - b^{T}r_{0}=0$$

$$\alpha_{0}r_{0}^{T}Ar_{0}+\frac{1}{2}r_{0}^{T}Ax^{0}+\frac{1}{2}(x^{0})^{T}Ar_{0}- b^{T}r_{0}=0$$

But, since a scalar equals its own transpose and A is symmetric, we have

\begin {split}
(x^{0})^{T}Ar_{0} & =\left((x^{0})^{T}Ar_{0}\right)^{T}\\
& =r_{0}^{T}A^{T}x^{0}\\
& =r_{0}^{T}Ax^{0}\\
(b^{T}r_{0})^{T}& =r_{0}^{T}b
\end {split}

Using the above, the equation becomes

$$\alpha_{0}r_{0}^{T}Ar_{0}+r_{0}^{T}\overbrace{(Ax^{0}-b)}^{-r_{0}}=0$$

$$\alpha_{0}=\frac{r_{0}^{T}r_{0}}{r_{0}^{T}Ar_{0}}$$
\begin{split}
x^{1} &= x^{0}+\alpha_{0} r_{0}\\
\alpha_{1}&=\frac{r_{1}^{T}r_{1}}{r_{1}^{T}Ar_{1}}\\
x^{2} &= x^{1}+\alpha_{1} r_{1}\\
&\ \ \vdots
\end{split}

The direction of r_{1} is perpendicular to r_{0}. Hence the iterates zigzag in a step-like fashion towards the final solution x^{c}.
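The update rule above can be sketched as a short loop. This is a minimal sketch assuming NumPy; `steepest_descent` is a name chosen here, not from the original text.

```python
import numpy as np

# Steepest descent: step along the residual r_k = b - A x_k with the
# optimal length alpha_k = (r^T r) / (r^T A r) derived above.
def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    x = x0.astype(float)
    for _ in range(max_iter):
        r = b - A @ x                    # residual = -gradient
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))  # exact line search
        x = x + alpha * r
    return x

A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 1.0])
print(steepest_descent(A, b, np.zeros(2)))  # approaches (1, 0.5)
```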

Instead of always following the residual, we can choose better search directions p_{k}. Take the first direction as p_{0}=r_{0}, and pick the second direction p_{1} to point from x^{1} straight at the solution x^{c}, i.e. for some scalar k,

$$p_{1} = k (x^{c} - x^{1})$$

\begin {split}
p_{1} & =k (x^{c} - x^{1})\\
A p_{1} & =k (Ax^{c} - Ax^{1})\\
& =k (b - Ax^{1})\\
& =k r_{1} \\
r_{0}^{T}A p_{1} & =k r_{0}^{T} r_{1} \\
& =0\\
p_{0}^{T}A p_{1} & =0
\end {split}

That is, p_{0} is A-orthogonal (conjugate) to the new direction p_{1}:

$$p_{0}^{T}A p_{1} =0$$

Now we can write p_{1} as a linear combination of r_{1} and p_{0} (where p_{0} = r_{0}):

$$p_{1} = r_{1} - \beta _{1} p_{0}$$

Substituting this expression for p_{1} into the A-orthogonality condition p_{0}^{T}Ap_{1}=0 and solving for \beta_{1},

$$\beta _{1} = \frac{p_{0}^{T}A r_{1} }{p_{0}^{T}A p_{0} }$$
$$x^{2} = x^{1}+\alpha_{1} p_{1}$$

What is \alpha_{1}? It should be chosen so that f is minimised.

$$f(x^{2}) = \frac{1}{2} (x^{1}+\alpha_{1} p_{1})^{T}A(x^{1}+\alpha_{1} p_{1}) - b^{T}(x^{1}+\alpha_{1} p_{1})+c$$

Equating the partial derivative of f with respect to \alpha_{1} to zero,

$$\frac{1}{2} p_{1}^{T}A(x^{1}+\alpha_{1} p_{1})+ \frac{1}{2}(x^{1}+\alpha_{1} p_{1})^{T}Ap_{1} - b^{T}p_{1}=0$$

$$\alpha_{1}p_{1}^{T}Ap_{1}+\frac{1}{2}p_{1}^{T}Ax^{1}+\frac{1}{2}(x^{1})^{T}Ap_{1}- b^{T}p_{1}=0$$

$$\alpha_{1}p_{1}^{T}Ap_{1}+p_{1}^{T}\overbrace{(Ax^{1}-b)}^{-r_{1}}=0$$
$$\alpha_{1}=\frac{p_{1}^{T}r_{1}}{p_{1}^{T}Ap_{1}}$$
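Putting the pieces together, the conjugate-direction iteration described above (p_{0} = r_{0}, then \beta, p, and \alpha at each step) can be written as a short loop. This is a sketch assuming NumPy; `conjugate_directions` is an illustrative name, not from the original text.

```python
import numpy as np

# Conjugate directions: p0 = r0, then p_next = r - beta * p with
# beta = (p^T A r) / (p^T A p), and step alpha = (p^T r) / (p^T A p).
def conjugate_directions(A, b, x0):
    x = x0.astype(float)
    r = b - A @ x
    p = r.copy()                         # p0 = r0
    for _ in range(len(b)):              # n steps suffice in exact arithmetic
        Ap = A @ p
        alpha = (p @ r) / (p @ Ap)
        x = x + alpha * p
        r = b - A @ x
        beta = (p @ (A @ r)) / (p @ Ap)
        p = r - beta * p                 # next direction, A-orthogonal to p
    return x

A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 1.0])
print(conjugate_directions(A, b, np.zeros(2)))  # exactly (1, 0.5) in 2 steps
```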

Continuing the example…

Taking x^{0} = (0,0)

\begin {split}
r_{0} & = b - A x^{0} =
\begin{bmatrix}
1  \\
1
\end{bmatrix} -
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
\begin{bmatrix}
0  \\
0
\end{bmatrix}
= \begin{bmatrix}
1  \\
1
\end{bmatrix}
\\
r_{0}^{T}r_{0} & =
\begin{bmatrix}
1  &
1
\end{bmatrix}
\begin{bmatrix}
1  \\
1
\end{bmatrix}
=2\\
r_{0}^{T}Ar_{0} & =
\begin{bmatrix}
1  &
1
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
\begin{bmatrix}
1  \\
1
\end{bmatrix}
=3\\
\alpha_{0}& =\frac{r_{0}^{T}r_{0}}{r_{0}^{T}Ar_{0}}=\frac{2}{3}\\
x^{1} & = x^{0}+\alpha_{0} r_{0} = \begin{bmatrix}
\frac{2}{3}  \\
\frac{2}{3}
\end{bmatrix} \\
r_{1} &= b - Ax^{1} =
\begin{bmatrix}
1  \\
1
\end{bmatrix}
-
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
\begin{bmatrix}
\frac{2}{3}  \\
\frac{2}{3}
\end{bmatrix} =
\begin{bmatrix}
\frac{1}{3}  \\
\frac{-1}{3}
\end{bmatrix}
\end {split}

Steepest descent would continue in the same way, computing \alpha_{1}, x^{2}, r_{2}, and so on. Instead, we now switch to the conjugate direction.
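For comparison, continuing plain steepest descent numerically (a sketch assuming NumPy) shows the zigzag behaviour: the residual shrinks by a factor of 1/3 each step but never vanishes exactly.

```python
import numpy as np

# Continue steepest descent from x0 = (0, 0) for three steps.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 1.0])

x = np.zeros(2)
for k in range(3):
    r = b - A @ x
    alpha = (r @ r) / (r @ (A @ r))
    x = x + alpha * r
    print(k + 1, x)
# Step 1: x = (2/3, 2/3); step 2: x = (8/9, 4/9); step 3: x = (26/27, 14/27).
# The iterates creep towards (1, 1/2) but do not reach it in two steps.
```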

Note that p_{0} = r_{0}.

\begin {split}
\beta _{1} &=  \frac{p_{0}^{T}A r_{1} }{p_{0}^{T}A p_{0} } \\
p_{0}^{T}A r_{1} &=  \begin{bmatrix}
1  &
1
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}
\begin{bmatrix}
\frac{1}{3}  \\
\frac{-1}{3}
\end{bmatrix}
=
\begin{bmatrix}
1  &
1
\end{bmatrix}
\begin{bmatrix}
\frac{1}{3}  \\
-\frac{2}{3}
\end{bmatrix}
=   -\frac{1}{3}
\\
p_{0}^{T}A p_{0} &= r_{0}^{T}A r_{0} =3\\
\beta _{1} &= -\frac{1}{9}\\
p_{1} &= r_{1} - \beta _{1} p_{0}\\
&= \begin{bmatrix}
\frac{1}{3}  \\
\frac{-1}{3}
\end{bmatrix}
+
\frac{1}{9}\begin{bmatrix}
1  \\
1
\end{bmatrix}\\
&=\begin{bmatrix}
\frac{4}{9}  \\
-\frac{2}{9}
\end{bmatrix} \\
\alpha_{1}&=\frac{p_{1}^{T}r_{1}}{p_{1}^{T}Ap_{1}}\\
&=\frac{\begin{bmatrix} \frac{4}{9} & -\frac{2}{9} \end{bmatrix} \begin{bmatrix} \frac{1}{3} \\ -\frac{1}{3} \end{bmatrix}}{\begin{bmatrix} \frac{4}{9} & -\frac{2}{9} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \frac{4}{9} \\ -\frac{2}{9} \end{bmatrix}}\\
&=\frac{\frac{2}{9}}{\begin{bmatrix} \frac{4}{9} & -\frac{2}{9} \end{bmatrix} \begin{bmatrix} \frac{4}{9} \\ -\frac{4}{9} \end{bmatrix}}\\
&=\frac{\frac{2}{9}}{\frac{8}{27}}\\
&=\frac{3}{4}\\
x^{2} &= x^{1}+\alpha_{1} p_{1}\\
&=  \begin{bmatrix}
\frac{2}{3}  \\
\frac{2}{3}
\end{bmatrix}
+
\frac{3}{4}
\begin{bmatrix}
\frac{4}{9}  \\
-\frac{2}{9}
\end{bmatrix} \\
&=  \begin{bmatrix}
1  \\
\frac{1}{2}
\end{bmatrix}
\end {split}

We have reached the exact solution x^{c} = (1, \frac{1}{2}) in just two steps.
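The two-step hand calculation above can be checked numerically (a sketch assuming NumPy; variable names mirror the text):

```python
import numpy as np

# Reproduce the worked example: one steepest-descent step, then one
# A-orthogonal (conjugate) step lands exactly on the solution.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 1.0])

x0 = np.zeros(2)
r0 = b - A @ x0
alpha0 = (r0 @ r0) / (r0 @ (A @ r0))       # 2/3
x1 = x0 + alpha0 * r0                      # (2/3, 2/3)
r1 = b - A @ x1                            # (1/3, -1/3)

p0 = r0
beta1 = (p0 @ (A @ r1)) / (p0 @ (A @ p0))  # -1/9
p1 = r1 - beta1 * p0                       # (4/9, -2/9)
alpha1 = (p1 @ r1) / (p1 @ (A @ p1))       # 3/4
x2 = x1 + alpha1 * p1
print(x2)                                  # [1.  0.5]
```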