Conditioning of Linear Equation Systems
Conditioning for Ax = b
Input \(A,b\), output \(x\)
Assume \(A\in\mathbb R^{n\times n}\) is non-singular (square matrix). Suppose \(b\) is perturbed to \(\hat b\); denote \(\Delta b = b - \hat b\).
Consider \(\Delta x = x - \hat x\), where \(A\hat x = \hat b\). Given \(\frac{\|\Delta b\|}{\|b\|}\), we want to bound \(\frac{\|\Delta x\|}{\|x\|}\)
So that \(A\Delta x = Ax - A\hat x = b - \hat b = \Delta b\), i.e. \(\Delta x = A^{-1}\Delta b\)
Then, using the two inequalities \(\|\Delta x\| = \|A^{-1}\Delta b\| \leq \|A^{-1}\|\|\Delta b\|\) and \(\|b\| = \|Ax\| \leq \|A\|\|x\|\), we get \(\frac{\|\Delta x\|}{\|x\|} \leq \|A\|\|A^{-1}\|\frac{\|\Delta b\|}{\|b\|}\)
Define \(\text{Cond}(A) = \|A\|\|A^{-1}\|\)
Also, note that \(\text{Cond}(A)^{-1} \frac{\|\Delta b\|}{\|b\|} \leq \frac{\|\Delta x\|}{\|x\|} \leq \text{Cond}(A)\frac{\|\Delta b\|}{\|b\|}\)
And \(\text{Cond}(A)^{-1} \leq \text{Cond}(A)\implies \text{Cond}(A)^2 \geq 1\implies \text{Cond}(A) \geq 1\)
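The two-sided bound is easy to check numerically. A minimal sketch (the matrix \(A\), right-hand side \(b\), and perturbation \(\Delta b\) below are made-up illustrative values):

```python
import numpy as np

# Illustrative system (values are assumptions for this sketch).
A = np.array([[3.0, -1.0], [-2.0, 1.0]])
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)

# Cond(A) = ||A|| * ||A^{-1}|| in the infinity norm.
cond = np.linalg.norm(A, np.inf) * np.linalg.norm(np.linalg.inv(A), np.inf)

# Perturb b and re-solve.
db = np.array([1e-6, -1e-6])        # Delta b
x_hat = np.linalg.solve(A, b - db)  # A x_hat = b_hat = b - Delta b
dx = x - x_hat                      # Delta x

rel_x = np.linalg.norm(dx, np.inf) / np.linalg.norm(x, np.inf)
rel_b = np.linalg.norm(db, np.inf) / np.linalg.norm(b, np.inf)

# Cond(A)^{-1} * ||Db||/||b||  <=  ||Dx||/||x||  <=  Cond(A) * ||Db||/||b||
assert rel_b / cond <= rel_x <= cond * rel_b
```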
Also, note that this lemma holds only when the matrix norm is induced by a vector norm (so that \(\|Av\| \leq \|A\|\|v\|\))
Theorem. For every non-singular \(A\in \mathbb R^{n \times n}\), \(\text{Cond}(A) \geq 1\) (Another proof)
proof. \(1 = \|I\| = \|A^{-1}A\| \leq \|A^{-1}\|\|A\| = \text{Cond}(A)\) by submultiplicativity of the induced matrix norm
Theorem (perturbing both \(A\) and \(b\), to first order) \(\frac{\|\Delta x\|}{\|x\|} \leq \text{Cond}(A)\big(\frac{\|\Delta b\|}{\|b\|} + \frac{\|\Delta A\|}{\|A\|}\big)\)
For \(2\times 2\) matrices
Example
Given \(A = \begin{bmatrix}3&-1\\-2&1\end{bmatrix}\) with \(\|\cdot\|_\infty\)
Then, \(A^{-1} = \begin{bmatrix}1&1\\2&3\end{bmatrix}\), \(\|A\|_\infty = \max(3 + 1, 2 + 1)=4\), \(\|A^{-1}\|_\infty = \max(1+1, 2+3) = 5\)
Therefore, \(\text{Cond}(A) = 4\times 5 = 20\), so \(A\) is reasonably well-conditioned.
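This computation can be reproduced with NumPy (a sketch; `np.linalg.cond` with `np.inf` uses the same maximum-absolute-row-sum norm):

```python
import numpy as np

A = np.array([[3.0, -1.0], [-2.0, 1.0]])
A_inv = np.linalg.inv(A)  # [[1, 1], [2, 3]] up to rounding

norm_A = np.linalg.norm(A, np.inf)          # max(3+1, 2+1) = 4
norm_A_inv = np.linalg.norm(A_inv, np.inf)  # max(1+1, 2+3) = 5, up to rounding

cond = norm_A * norm_A_inv                  # ~ 4 * 5 = 20
assert np.isclose(cond, np.linalg.cond(A, np.inf))
```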
Define \(y:= A^{-1}x\) so that \(x\neq 0 \Leftrightarrow y\neq 0\)
So that \(\|A^{-1}\| = \max_{x\neq 0}\frac{\|A^{-1}x\|}{\|x\|} = \max_{y\neq 0}\frac{\|y\|}{\|Ay\|} = \Big(\min_{y\neq 0}\frac{\|Ay\|}{\|y\|}\Big)^{-1}\)
Hence \(\text{Cond}(A) = \dfrac{\max_{x\neq 0}\|Ax\|/\|x\|}{\min_{x\neq 0}\|Ax\|/\|x\|}\) is the ratio between how much \(A\) can expand a vector and how much \(A\) can contract one
Properties of Conditioning Numbers
\(\text{Cond}(A)\geq 1\)
\(\text{Cond}(I) = \|I\|\|I^{-1}\| = 1\)
\(\text{Cond}(\gamma A) = \|\gamma A\|\|(\gamma A)^{-1}\| = |\gamma|\|A\|\,|\gamma|^{-1}\|A^{-1}\| = \text{Cond}(A)\) for any \(\gamma\neq 0\)
Diagonal matrices
If \(D = \text{diag}(d_1, d_2, \ldots, d_n)\) with every \(d_i\neq 0\), then \(\text{Cond}(D) = \dfrac{\max_i |d_i|}{\min_i |d_i|}\)
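A quick check of this formula (a sketch; the diagonal entries are arbitrary nonzero values chosen for illustration):

```python
import numpy as np

d = np.array([4.0, -0.5, 2.0])  # illustrative entries, all nonzero
D = np.diag(d)

# For a diagonal matrix, Cond(D) = max|d_i| / min|d_i|.
cond = np.linalg.cond(D, np.inf)
assert np.isclose(cond, np.abs(d).max() / np.abs(d).min())  # 4 / 0.5 = 8
```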
Near singular
Changing the elements of \(A\) by some perturbation \(E\) can make \(A+E\) singular; the smallest such relative perturbation is \(\min\big\{\frac{\|E\|}{\|A\|} : A+E \text{ singular}\big\} = \frac{1}{\text{Cond}(A)}\)
So a larger condition number indicates that \(A\) is closer to being singular
\(\det(A) = 0 \iff A\) is singular,
but a small \(\det\) does not mean \(A\) is near singular: \(\det(\gamma I_n) = \gamma^n\) can be made arbitrarily small while \(\text{Cond}(\gamma I) = 1\), i.e. \(\gamma I\) stays far from singular
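The \(\gamma I\) example is easy to verify numerically (a sketch; \(\gamma = 10^{-8}\) and \(n = 4\) are arbitrary choices):

```python
import numpy as np

gamma = 1e-8
A = gamma * np.eye(4)

# det(gamma * I_4) = gamma^4 is tiny...
assert np.linalg.det(A) < 1e-30

# ...yet Cond(gamma * I) = |gamma| * |gamma|^{-1} = 1: nowhere near singular.
assert np.isclose(np.linalg.cond(A, np.inf), 1.0)
```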
Estimate matrix inverse
Note that computing \(A^{-1}\) explicitly requires \(\sim n^3\) operations (by Gaussian elimination). To estimate \(\|A^{-1}\|\) for computing \(\text{Cond}(A)\) instead: pick a vector \(y\), solve \(Az = y\) (cheap, \(\sim n^2\) per solve once \(A\) is factored); then \(z = A^{-1}y\) gives the lower bound \(\frac{\|z\|}{\|y\|} \leq \|A^{-1}\|\)
Therefore, choose a sequence of \(y\)'s to try to make \(\frac{\|z\|}{\|y\|}\) as large as possible (iterative estimation)
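A naive version of this idea, just to illustrate the lower-bound principle (the random-sampling loop is an assumption of this sketch; real condition estimators such as LAPACK's choose the trial vectors far more cleverly):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, -1.0], [-2.0, 1.0]])  # illustrative matrix

# Each trial: pick y, solve A z = y, so z = A^{-1} y and
# ||z|| / ||y|| <= ||A^{-1}||. The max over trials is a lower-bound estimate.
est = 0.0
for _ in range(100):
    y = rng.standard_normal(2)
    z = np.linalg.solve(A, y)
    est = max(est, np.linalg.norm(z, np.inf) / np.linalg.norm(y, np.inf))

true_norm = np.linalg.norm(np.linalg.inv(A), np.inf)  # = 5 for this A
assert est <= true_norm + 1e-9
```

In practice \(A\) is factored once, so each additional solve costs only \(\sim n^2\), far cheaper than forming \(A^{-1}\).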
Rounding errors
Define \(\Delta A = A - \hat A\), \(\Delta b = b - \hat b\); then \(\frac{\|\Delta A\|}{\|A\|} \approx \epsilon_{mach}\) and \(\frac{\|\Delta b\|}{\|b\|} \approx \epsilon_{mach}\). Then, by the theorem above, \(\frac{\|\Delta x\|}{\|x\|} \lesssim 2\,\text{Cond}(A)\,\epsilon_{mach}\)
Consider solving \(Ax=b\) numerically to get \(\hat x\); define the residual \(r := b - A\hat x\). Since \(A\hat x = \hat b\), we have \(r = b - \hat b = \Delta b\)
Define \(\Delta x= x - \hat x\); then \(\frac{\|\Delta x\|}{\|x\|} \leq \text{Cond}(A)\,\frac{\|r\|}{\|b\|}\)
Nearly all good linear-equation solvers produce a small residual \(\|r\|\), so the accuracy of \(\hat x\) is governed by the condition number
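A classic illustration of "small residual, large error" (a sketch using the notoriously ill-conditioned Hilbert matrix; \(n = 10\) is an arbitrary choice):

```python
import numpy as np

# Hilbert matrix H_ij = 1 / (i + j + 1): Cond grows explosively with n.
n = 10
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

x_true = np.ones(n)
b = H @ x_true
x_hat = np.linalg.solve(H, b)

rel_residual = np.linalg.norm(b - H @ x_hat, np.inf) / np.linalg.norm(b, np.inf)
rel_error = np.linalg.norm(x_true - x_hat, np.inf) / np.linalg.norm(x_true, np.inf)

# The residual is tiny (the solver did its job), but the error in x_hat
# is amplified, consistent with ||Dx||/||x|| <= Cond(A) * ||r||/||b||.
assert rel_residual < 1e-10
assert rel_error > rel_residual
```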