Solving Systems of Linear Equations (3)

The Pseudoinverse of a Matrix

The pseudoinverse introduced here is the Moore-Penrose inverse, defined as follows: given a matrix ARmnA\in R^{m*n}, if a matrix ARnmA^{\dagger}\in R^{n*m} satisfies AAA=AAA^{\dagger}A=A and there exist two matrices URnn,VRmmU\in R^{n*n},V\in R^{m*m} such that

A=UAT,A=ATVA^{\dagger}=UA^{T},A^{\dagger}=A^{T}V

then AA^{\dagger} is called the pseudoinverse of the matrix AA. It can be proven that the pseudoinverse of a matrix is unique.

For a matrix ARmn,mnA\in R^{m*n},m\ge n with rank(A)=nrank(A)=n, one can verify from the above definition that the pseudoinverse of AA is

A=(ATA)1ATA^{\dagger}=(A^{T}A)^{-1}A^{T}

For a matrix ARmn,mnA\in R^{m*n},m\le n with rank(A)=mrank(A)=m, one can likewise verify from the above definition that the pseudoinverse of AA is

A=AT(AAT)1A^{\dagger}=A^{T}(AA^{T})^{-1}

The two cases above give the pseudoinverse when the matrix has full column rank or full row rank. For a general matrix ARmn,rank(A)=r,rmin(m,n)A\in R^{m*n},rank(A)=r,r\le min(m,n), we can use the method of full-rank factorization to obtain its pseudoinverse.

Any matrix ARmn,rank(A)=r,rmin(m,n)A\in R^{m*n},rank(A)=r,r\le min(m,n) can be factored into the product of a full-row-rank matrix and a full-column-rank matrix: that is, A=BC,BRmr,cRrn,rank(A)=rank(B)=rank(C)=rA=BC,B\in R^{m*r},c\in R^{r*n},rank(A)=rank(B)=rank(C)=r

It can be proven that: A=CBA^{\dagger}=C^{\dagger}B^{\dagger}, where B=(BTB)1BT,C=CT(CCT)1B^{\dagger}=(B^{T}B)^{-1}B^{T},C^{\dagger}=C^{T}(CC^{T})^{-1}, which is how the pseudoinverse of a general matrix is computed.

Solving Systems of Linear Equations in the General Case

Consider a system of linear equations Ax=b,ARmn,rank(A)=rAx=b,A\in R^{m*n},rank(A)=r. The vector x=Abx^{*}=A^{\dagger}b minimizes Axb2||Ax-b||^{2} over the space RnR^{n}; moreover, among all vectors in RnR^{n} that minimize Axb2||Ax-b||^{2}, the vector x=Abx^{*}=A^{\dagger}b has the smallest norm and is unique.

When r=mr=m, AA has full row rank, and in this case x=Ab=AT(AAT)1bx^{*}=A^{\dagger}b=A^{T}(AA^{T})^{-1}b is the minimum-norm solution of the system Ax=bAx=b.

When r=nr=n, AA has full column rank, and in this case x=Ab=(ATA)1ATbx^{*}=A^{\dagger}b=(A^{T}A)^{-1}A^{T}b is the least-squares solution of the system Ax=bAx=b.