Skip to content

Amortized Analysis

Amortized Complexity

Let σ=(σi)1m\sigma = (\sigma_i)_1^m be a sequence of operations on data structure DD, and tit_i be the time to execute σi\sigma_i. Let SS be the set of all valid sequence of operations on DD so that σS\sigma\in S.

Sequence complexity C(m)=max{1mti:σS}C(m) = \max\{\sum_1^m t_i: \sigma \in S\}.
Amortized complexity A(m)=C(m)mA(m) = \frac{C(m)}{m} time each operation takes in average.

Aggregate Analysis

In aggregate analysis, we show that for all nn, a sequence of nn operations takes worst-case time T(n)T(n) in total. In the worst case, the average cost, or amortized cost, per operation is therefore T(n)/nT(n)/n.

Accounting Method

We assign differing charges to different operations, with some operations charged more or less than they actually cost. The amount charged on each operation is called amortized cost. When an operation's amortized cost exceeds its actual cost, the differnt is assigned to specific objects in the data structure as credit.

In the worst case analysis, we have to make such that the the amortized cost provides an upper bound on the actual cost. Which mean that for all sequences of nn operations. nc^inci\sum^n \hat c_i \geq \sum^n c_i where c^i\hat c_i is the amortized cost for ith operation, cic_i is the actual cost. Then, the total credit stored is nc^inci0\sum^n \hat c_i - \sum^n c_i \geq 0 all the time.

[+] Potential Method

As an extension to accounting method, we represents the prepaid work (credit in accounting method) as "potential". We associate the potential with the data structure instead of specific objects within the data structure.

Let the state of an initial data structure as D0D_0, and DiD_i be the state of the data structre after applying the ith operation to Di1D_{i-1}. Define a potential function

Φ:{D0,...,Dn1}R\Phi: \{D_0, ..., D_{n-1}\}\rightarrow \mathbb R

to be the potential associated with data structure DiD_i. Then, the amortized cost c^i\hat c_i of the ith operation w.r.t. potential function Φ\Phi is defined by

c^i=ci+Φ(Di)Φ(Di1)\hat c_i = c_i + \Phi(D_i) - \Phi(D_{i-1})

and the total amortized cost of the n operations is

i=1nc^i=i=1n(ci+Φ(Di)Φ(Di1))=i=1nci+Φ(Dn)Φ(D0)\sum_{i=1}^n \hat c_i = \sum_{i=1}^n (c_i + \Phi(D_i) - \Phi(D_{i-1})) = \sum_{i=1}^n c_i + \Phi(D_n) - \Phi(D_0)

Example: Binary Counter

Consider a binary counter increment, which keeps a number xmod  2kx\mod 2^k be increament. The data structure used is a sequence A:=(b0,...,bk1),bi{0,1}.x=i=0k1ai2iA:=(b_0, ..., b_{k-1}), b_i \in \{0, 1\}. x = \sum_{i=0}^{k-1} a_i 2^i.

increment(A)
1
2
3
4
5
6
7
j = 0
while j < A.k and A[j] == 1:
    A[j] = 0
    j += 1
if j == k:
    A[j] = 0
A[j] = 1

Aggregate Analysis

Consider the number of bits flipped in each increment be the actual cost, we know that A[i]A[i] flipped every 2i2^i time. Thus, let NjN_j be the number of jth bit flips. we have the total cost

C(m)=j=0k1Nj=j=0k1m2imj=0k12i=2mC(m) = \sum_{j=0}^{k-1} N_j = \sum_{j=0}^{k-1} \lfloor \frac{m}{2_i}\rfloor \leq m\sum_{j=0}^{k-1} 2^{-i} = 2m
A(m)=C(m)/m=2O(1)A(m) = C(m)/m = 2\in O(1)

Accounting Method

charge an amortized cost of 2 dollars to set a bit to 11. When a bit is set we use 1 dollar, and the other 1 dollar as credit to be used later when we flip the bit back to 0. Therefore, every 1 in the counter has a dollar of credit, so that we charge nothing to reset a bit to 00. Since there will always be 1 in AA, the amount of credits is always positive. Thus, the total amortized cost is C(n)O(n)C(n)\in O(n) and A(n)O(1)A(n)\in O(1)

[+] Potential Method

Suppose that ith operation set tit_i bits from 1 to 0. Then citi+1c_i \leq t_i + 1.
Define Φ(Ai):=j=0k1I(bj=1)\Phi(A_i) := \sum_{j=0}^{k-1}\mathbb I(b_j = 1), i.e. the number of 1's in AiA_i. Then, Φ(Ai)Φ(Ai1)ti+1\Phi(A_i) - \Phi(A_{i-1}) \leq -t_i + 1 since at least tit_i number of 1's in Ai1A_{i-1} so that they can be reset by ith operation, and at most one 0-bit can be flipped to 11 in each operation. Therefore

c^i=ci+Φ(Ai)Φ(Ai1)ti+1ti+1=2\hat c_i = c_i +\Phi(A_i) - \Phi(A_{i-1}) \leq t_i + 1 -t_i + 1 = 2

We know that Φ(D0)=0,Φ(Dn)k\Phi(D_0) = 0, \Phi(D_n) \leq k so that

C(n)=2nkO(n),A(n)2O(1)C(n) = 2n -k \in O(n), A(n) \leq 2 \in O(1)

Example: Dynamic Table

Instead of using linked list in chaining implementation, we instead us a dynamic sized table. For each bin, we store a pointer to an array arr, a number size to be the number of elements stored, and maxsize to be the array length. Then, insert is implemented as

insert(T, x)
1
2
3
4
5
6
7
if T.size == T.maxsize:
    allocate new_arr of size 2 * T.maxsize
    T.maxsize *= 2
    copy T.arr into new_arr # T.size ops
    T.arr = new_arr
T.arr[T.size] = x # 1 op
T.size += 1

Let d(T):=size/maxsized(T) := \text{size}/\text{maxsize} be the load factor, assuming that we start with maxsize=1\text{maxsize} = 1. After each insert, we have that 0.5<d(T)10.5 < d(T) \leq 1 since we only double maxsize when size == maxsize.

[+] Potential Method

Define Φ(Ti)=2×\Phi(T_i) = 2 \times number of occupied slot in the second half of arr.

Φ(Ti)=2(sizemaxsize/2)=2sizemaxsize\Phi(T_i) = 2 (\text{size} - \text{maxsize}/2) = 2\text{size} - \text{maxsize}

Then, we have two cases to consider

If Ti.size=Ti.maxsizeT_i.\text{size} = T_i.\text{maxsize}, then we have the cost

ci=Ti.size+1c_i = T_i.\text{size} + 1

Also we know that Ti1.size=Ti.size1,Ti1.maxsize=Ti.maxsize/2T_{i-1}.\text{size} = T_i.\text{size} - 1, T_{i-1}.\text{maxsize} = T_i.\text{maxsize} / 2. Thus

ϕiϕi1=2Ti.sizeTi.maxsize2(Ti.size1)+Ti.maxsize/2=2Ti.maxsize/2\begin{align*} \phi_i - \phi_{i-1} &= 2T_i.\text{size} - T_i.\text{maxsize} - 2(T_i.\text{size} - 1) + T_i.\text{maxsize} / 2\\ &= 2 - T_i.\text{maxsize}/2 \end{align*}

From the if condition, we also have that $ T_i.\text{maxsize}/2 = T_i.\text{size} - 1$

ci+ϕiϕi1=Ti.size+1+2Ti.size+1=4O(1)c_i + \phi_i - \phi_{i-1} = T_i.\text{size} + 1 + 2 - T_i.\text{size} + 1 = 4\in O(1)

If Ti.size<Ti.maxsizeT_i.\text{size} < T_i.\text{maxsize}, then we have the cost

ci+ϕiϕi1=1+2=3O(1)c_i + \phi_i - \phi_{i-1} = 1 + 2 = 3 \in O(1)

Accounting Method

For each insert, we charge 33 dollars. 1 dollar for setting xx, and 2 dollar as credit. When the if branch runs, for each index ii, it pays for the cost of copying itself and the slot iT.maxsize/4i - T.\text{maxsize} / 4. Consider one iteration before the if branch runs. Known that T.size == T.maxsize then the second half have 2 credits since they have never been copied. Thus, they have enough credit to pay for the cost of the first half.

Aggregate Analysis

After nn insert, the maxsize of the table will be 2lgn2^{\lceil \lg n\rceil}, thus we have doubled the table size lgn1\lceil \lg n\rceil - 1 times. Each time copies 20,21,...,2lgn12^0, 2^1, ..., 2^{\lceil \lg n\rceil - 1} elements. Thus, the total costs will be

C(n)=n+i=0lgn12i=n+2lgn1n+2lgn+1=3nC(n) = n + \sum_{i=0}^{\lceil \lg n\rceil - 1} 2^i = n + 2^{\lceil \lg n\rceil} - 1 \leq n + 2^{\lg n + 1} = 3n