Solving Systems of Linear Equations (1)

Least Squares Analysis

This article discusses solving one particular case of systems of linear equations, namely considering the system

Ax=b

where $A\in R^{m*n},b\in R^{m},m\ge n,rank(A)=n$ . In this case, the number of unknowns is smaller than the number of equations, so it is very likely that $b$ does not lie in the range space of $A$ , meaning the system has no solution. However, in this situation we can obtain the least squares solution of this linear system, that is, there exists $x^{*}$ such that for all $x\in R^{n}$ we have

||Ax-b||^{2}\ge ||Ax^{*}-b||^{2}

We then call $x^{*}$ the least squares solution of this linear system. When $b$ lies in the solution space of $A$ , $x^{*}$ is naturally the solution of the system. The least squares solution can be computed directly via the following formula:

x^{*}=(A^{T}A)^{-1}A^{T}b

The proof is as follows. Construct the objective function

f(x)=||Ax-b||^{2}\\ =(Ax-b)^{T}(Ax-b)\\ =\frac{1}{2}x^{T}(2A^{T}A)x-x^{T}(2A^{T}b)+b^{T}b

Clearly the function $f$ is a quadratic form, and since $rank(A)=n$ , this quadratic form is positive definite. Using the first-order necessary condition for a local minimizer, we find that the minimizer satisfies

\nabla f(x)=2A^{T}Ax-2A^{T}b=0

The unique solution of this equation is $x^{*}=(A^{T}A)^{-1}A^{T}b$ , which is the least squares solution. The least squares method is a very convenient algorithm for applications such as line fitting.

Recursive Least Squares

The previous section introduced the least squares method, which we can use for line fitting. If we want to add several groups of data to the fitted data, we can do so via the recursive least squares method, that is, by making partial corrections based on the previous fitting result. In other words, we use the least squares solution $x^{*}$ obtained from the previous fit to derive the least squares solution $x^{*}$ after the data points have been added.

For a given optimization problem, in order to find a suitable $x$ that minimizes $||A_{0}x-b^{(0)}||^{2}$ , the solution to this problem is known to be $x^{(0)}=G_{0}^{-1}A_{0}^{T}b^{(0)}$ , where $G_{0}=A^{T}_{0}A_{0}$ . If new data is added, represented by the matrix $A_{1}$ and the vector $b_{1}$ , then the problem becomes finding $x$ that minimizes

|| \begin{bmatrix} A_{0}\\A_{1} \end{bmatrix}x-\begin{bmatrix} b^{(0)}\\b^{(1)} \end{bmatrix}||^{2}

Its iterative formula is:

G_{1}=G_{0}+A_{1}^{T}A_{1}\\ x^{(1)}=x^{(0)}+G_{1}^{-1}A_{1}^{T}(b^{(1)}-A_{1}x^{(0)})

The proof is omitted for now. In the general case, the iterative formula is:

G_{k+}=G_{k}+A_{k+1}^{T}A_{k+1}\\ x^{(k+1)}=x^{(k)}+G_{k+1}^{-1}A_{k+1}^{T}(b^{(k+1)}-A_{k+1}x^{(k)})

For solving systems of linear equations, we will later introduce the minimum-norm solution and the general solution method, as well as the pseudoinverse and related topics. To be continued…

Technology

2018 · 07 · 20