
Approximation Errors, Conditioning, and Stability

```python
import matplotlib.pyplot as plt
import numpy as np
```

Let $A :=$ the approximation, $T :=$ the true value.

Absolute error and relative error

Absolute error: $A - T$
Relative error: $\frac{A-T}{T}$; letting $\delta := \frac{A-T}{T}$, we can write $A = T(1+\delta)$, which expresses the same relationship without an explicit division (the relative error itself is problematic when $T = 0$).

Correct digits: $A := 5.46729 \times 10^{-12}$, $T := 5.46417 \times 10^{-12}$.
Then $\delta_{\text{absolute}} = A - T = 0.00312 \times 10^{-12}$,
$\delta_{\text{relative}} = 0.00312 / 5.46417 = 3.12 / 5.46417 \times 10^{-3} \approx 5.7 \times 10^{-4}$.
Let $p$ be the position of the first digit on which $A$ and $T$ disagree; then the relative error has magnitude around $10^{-p \pm 1}$.

$0.99\ldots = 1$: consider $A = 1.00596 \times 10^{-10}$, $T = 0.99452 \times 10^{-10}$. A naive digit-by-digit comparison suggests the two do not agree even in the leading digit, yet the relative error is only about $1.15 \times 10^{-2}$. Since $0.999\ldots$ agrees with $1.000\ldots$, we say $A$ and $T$ agree to 2 significant digits.
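A quick numeric check of both examples, using nothing beyond the definitions above (the helper `abs_rel_error` is introduced here only for illustration):

```python
def abs_rel_error(A, T):
    """Absolute and relative error of an approximation A against the true value T."""
    return A - T, (A - T) / T

print(abs_rel_error(5.46729e-12, 5.46417e-12))  # ≈ (3.12e-15, 5.7e-4): agree to 3 significant digits
print(abs_rel_error(1.00596e-10, 0.99452e-10))  # ≈ (1.14e-12, 1.15e-2): agree to ~2 significant digits
```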

Data error and computational error

Data error: let $\hat x :=$ the input value actually used and $x$ the true value; the data error is $\hat x - x$. For example:
- the input may not be representable as a terminating floating-point number (e.g. $3.14 \sim \pi$)
- measurement error

Computational error: let $f$ be the actual mapping and $\hat f$ the approximating function; the computational error is $\hat f(x) - f(x)$. For example, $\cos$ is often computed from its Taylor series, but the Taylor series is an infinite sum and we can only compute a finite truncation of it.
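As a small illustration (a sketch only; real libraries do not evaluate $\cos$ this way), a truncated Taylor series incurs a computational error that shrinks as more terms are kept:

```python
import math

def cos_taylor(x, n_terms):
    """Partial sum of the Taylor series cos(x) = sum_k (-1)^k x^(2k) / (2k)!."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(n_terms))

x = 1.2
for n in (2, 4, 8):
    # computational (truncation) error of the finite approximation
    print(n, abs(cos_taylor(x, n) - math.cos(x)))
```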

Truncation error and rounding error (subclasses of computational error)

Truncation error: the difference between the true result $f$ and the result $\hat f$ that would be produced by a given (finite-approximation) algorithm using exact arithmetic.

Rounding error: the difference between the result produced by a given algorithm using exact arithmetic and the result produced by the same algorithm using finite-precision, rounded arithmetic.

Example: approximate $f'(x)$ for a known $f$.
By definition $f'(x) = \lim_{h\rightarrow 0} \frac{f(x+h) - f(x)}{h}$; hence, choosing a small but nonzero value $h$, we can approximate $f'(x)$ for any $x$.
Assuming $f$ is smooth, then

$$\begin{align*}
&\quad \frac{f(x+h)-f(x)}{h}\\
&= \frac{1}{h}\bigg(\sum_{k=0}^{\infty} f^{(k)}(x)\frac{h^{k}}{k!} - f(x)\bigg) &&\text{Taylor expansion around } x \\
&= \frac{1}{h}\big(f(x)+f'(x)h + f''(c)h^2/2 - f(x)\big) &&\text{remainder theorem}, c\in[x, x+h]\\
&= f'(x) + \frac{f''(c)h}{2}
\end{align*}$$

Therefore, the truncation error is $\frac{f''(c)h}{2}$. Since $f$ is smooth, $f''$ is continuous on $[x, x+h]$ and hence bounded there, say $|f''| \leq M$, so the truncation error is at most $Mh/2$.
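A minimal numerical check, taking $f = \exp$ (so $f'' = \exp$) as an assumed test function: for moderate $h$, where rounding is negligible, the forward-difference error tracks $f''(x)h/2$:

```python
import numpy as np

f, x = np.exp, 1.0
for h in (1e-1, 1e-2, 1e-3):
    forward_diff = (f(x + h) - f(x)) / h
    error = forward_diff - np.exp(x)      # exact derivative of exp is exp
    print(h, error, np.exp(x) * h / 2)    # error ≈ f''(x) h / 2
```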

Then, let $FL$ be the mapping to the floating-point (rounded) result, and assume

$$\begin{align*}
FL(f(x)) &= f(x) + \epsilon_0 \\
FL(f(x+h)) &= f(x+h) + \epsilon_h \\
FL(f'(x)) &= f'(x) + \hat\epsilon\\
\Rightarrow FL\bigg(\frac{f(x+h) - f(x)}{h} - f'(x)\bigg) &= \frac{f(x+h) + \epsilon_h - (f(x) + \epsilon_0)}{h} - (f'(x) + \hat\epsilon) \\
&= \frac{f''(c)h}{2} + \frac{\epsilon_h - \epsilon_0}{h} - \hat\epsilon \\
\text{Total Error} &= \text{Truncation Error} + \text{Rounding Error}
\end{align*}$$

Note that $\epsilon_h$ is only weakly influenced by $h$ when $h$ is small, so we can treat $\epsilon_h - \epsilon_0 =: \epsilon$ as a constant.
Then, letting $E(h) := \frac{Mh}{2} + \frac{\epsilon}{h}$ bound the total error, $E'(h) = \frac{M}{2} - \frac{\epsilon}{h^{2}} \Rightarrow \operatorname{argmin}_h E(h) = \sqrt{2\epsilon/M}$.
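A sketch of this trade-off, assuming $f = \sin$ at $x = 1$: as $h$ shrinks, the total error first decreases (truncation dominates) and then grows like $1/h$ (rounding dominates), with the minimum near $\sqrt{\epsilon_{\text{mach}}} \approx 10^{-8}$:

```python
import numpy as np

f, df, x = np.sin, np.cos, 1.0
for h in 10.0 ** np.arange(-1, -17, -1):
    total_error = abs((f(x + h) - f(x)) / h - df(x))
    print(f"h = {h:.0e}   total error = {total_error:.2e}")
# the error is smallest around h ~ 1e-8 ~ sqrt(machine epsilon)
```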

Claim: $\lim_{h\rightarrow 0} FL(f(x+h)) - FL(f(x)) = 0$

Notice that when $h$ is small relative to the machine precision at $x$ (roughly $h < \epsilon_{\text{mach}}|x|$, i.e. $x \gg h$), $x + h$ rounds to exactly $x$.

Therefore, for very small $h$, $FL\bigg(\frac{f(x+h) - f(x)}{h} - f'(x)\bigg) = FL(-f'(x))$.
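This is easy to observe directly (assuming IEEE double precision, $\epsilon_{\text{mach}} \approx 2.2 \times 10^{-16}$):

```python
import numpy as np

x, h = 1.0, 1e-20                         # h far below eps_mach * |x|
print(x + h == x)                         # True: x + h rounds to exactly x
print((np.sin(x + h) - np.sin(x)) / h)    # 0.0: the computed difference quotient collapses
```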

Forward error and backward error

Let $y = f(x)$ be the true result and $\hat y = COMP(f(x))$ the computed result, where $COMP$ stands for all the errors incurred in the computation. Then the forward error is $\hat y - y$.
Take $\hat x$ such that, under exact computation, $\hat y = f(\hat x)$; then the backward error is $\hat x - x$.
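For example, take $f(x) = \sqrt{x}$ at $x = 2$ and suppose some inexact computation returns $\hat y = 1.414$ (this particular value is only assumed for illustration); then $\hat x = \hat y^2$ is the input that $f$ maps exactly to $\hat y$:

```python
import math

x = 2.0
y = math.sqrt(x)         # true result
y_hat = 1.414            # assumed output of an inexact computation
print(y_hat - y)         # forward error  ≈ -2.1e-4
print(y_hat ** 2 - x)    # backward error ≈ -6.0e-4 (x_hat = y_hat^2 maps exactly to y_hat)
```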

Conditioning of Problems

Let $\hat x \in (x-\delta, x+\delta)$, i.e. any $\hat x$ within a neighborhood of some $x$.
Consider the relative forward error, assuming $f$ is differentiable around $x$:

$$\begin{align*}
\frac{\hat y - y}{y} &:= \frac{f(\hat x) - f(x)}{f(x)} \\
&= \frac{f'(c)}{f(x)}(\hat x - x) &&\text{MVT, }c\in [x, \hat x]\\
&= \frac{xf'(c)}{f(x)}\frac{\hat x - x}{x} \\
&\approx \frac{xf'(x)}{f(x)}\frac{\hat x - x}{x} &&c \text{ is quite near } x \\
&= \text{condition number} \times \text{relative backward error}
\end{align*}$$

The problem is well-conditioned if the condition number is small for all $x$; it is ill-conditioned if the condition number is large for some $x$.
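A quick numerical check of this relation, using the assumed example $f(x) = \frac{1}{1-x}$, whose condition number is $\left|\frac{x f'(x)}{f(x)}\right| = \frac{x}{1-x}$ (equal to $9$ at $x = 0.9$):

```python
def f(x):
    return 1.0 / (1.0 - x)    # example function; f'(x) = 1/(1-x)^2

x = 0.9
x_hat = x * (1 + 1e-8)                  # relative backward (input) error of 1e-8
rel_forward = (f(x_hat) - f(x)) / f(x)
cond = x / (1.0 - x)                    # |x f'(x) / f(x)| = 9 here
print(rel_forward, cond * 1e-8)         # both ≈ 9e-8
```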

Connection to approximation error: with $y = f(x)$ and $\hat y = f(\hat x)$, the conditioning with respect to the absolute error $\hat x - x$ differs from the conditioning with respect to the relative error $\frac{\hat x - x}{x}$; we usually measure relative errors.

Example: $f(x):= \sqrt x$, $x\geq 0$
For all $x > 0$, $c \approx \frac{xf'(x)}{f(x)} = \frac{x}{2\sqrt x\sqrt x} = 1/2 \Rightarrow$ very well-conditioned.

Example: $f(x)=e^x$

$c\approx \frac{xe^x}{e^x} = x \Rightarrow$ less and less well-conditioned as $|x|$ gets big. However, $e^x$ overflows or underflows when $|x|$ is large, even before the problem becomes badly ill-conditioned.
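To make the overflow remark concrete (IEEE double precision assumed): the largest representable double is about $1.8 \times 10^{308} \approx e^{709.8}$, so $e^x$ overflows near $x \approx 710$, where the condition number is still only about $710$:

```python
import numpy as np

print(np.exp(709.0))    # ≈ 8.2e307: still representable; condition number ≈ 709
print(np.exp(710.0))    # inf (with an overflow warning), though the problem is only mildly ill-conditioned
```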

Example: $f(x)=\sin(x)$

$c\approx \frac{x\cos(x)}{\sin(x)} = x\cot(x) \Rightarrow$ ill-conditioned near $k\pi$, $k\neq 0$, or when $|x|$ is big and $\cos(x)\neq 0$.

```python
# condition number x*cos(x)/sin(x) on a fine grid; it spikes near the nonzero multiples of pi
# (t = 0 itself gives 0/0 = nan and is left as a gap in the plot)
t = np.arange(-5., 5., 0.01)
plt.figure(figsize=(12, 4))
plt.plot(t, t * np.cos(t) / np.sin(t))
plt.title("conditioning number of sin(x)")
plt.savefig("assets/approx_error_1.jpg")
```

![condition number of sin(x) on (-5, 5)](assets/approx_error_1.jpg)

```python
# same plot over a wider range: the condition number generally grows with |x| and spikes near multiples of pi
t = np.arange(-100., 100., 0.2)
plt.figure(figsize=(12, 4))
plt.plot(t, t * np.cos(t) / np.sin(t))
plt.title("conditioning number of sin(x)")
plt.savefig("assets/approx_error_2.jpg")
```

![condition number of sin(x) on (-100, 100)](assets/approx_error_2.jpg)

Stability of Algorithms

An algorithm is stable if a small change in the input results in only a small change in the output.

stable ≠ good

Consider $\hat f(x) := 0$ vs. $f(x):= \sin(x)$: the constant algorithm $\hat f$ is perfectly stable, yet it does not provide the good approximation we want.