Amortized Analysis

See Chapter 5.1 of the Okasaki textbook for a summary of two approaches to amortized analysis, the banker's method and the physicist's method. Here, we will say just a few extra words about the accounting for the FastQueue implementation from before.

Recall that when evaluating the running times of functions, we often use symbolic constants instead of concrete units (such as seconds) in order to abstract away from low-level, incidental, and variable characteristics of an execution environment. And because Big-O analysis, furthermore, abstracts away the particular constants of a function, symbolic constants (such as k) are often replaced by particular small constants (such as 1).

Once comfortable with such reasoning, we can often jump directly to saying things like "the amortized cost of enqueue is 2" and "the amortized cost of dequeue is 1". In the meantime, let us spend the extra effort to explicitly account for the actual (symbolic) costs of operations before reducing them to their asymptotic characteristics.

Actual Costs (ti)

To avoid confusion with the notation c for credits from the textbook, here we will use the metavariable k to range over constants. Let the following constants stand for actual costs of the various operations on a Queue that contains m elements in the back list:

  • k1 is the cost of enqueue;
  • k2 is the cost of peek;
  • k3 is the cost of dequeue when List.reverse is not called (the cheap case); and
  • k4m + k5 is the cost of dequeue when List.reverse is called (the expensive case).

Notice the actual cost of dequeue in the expensive case is linear in the size of the back list.

Amortized Costs (ai)

We now revisit the amortized analyses from the textbook using the symbolic costs above.

Banker's Method (ai = ti + ci - ci)

We allocate k4 credits for each element added to the back, where k4 was the cost for List.reverse to process one element.

  • enqueue: k1 + k4 - 0 = k1 + k4
  • peek: k2 + 0 - 0 = k2
  • dequeue (cheap): k3 + 0 - 0 = k3
  • dequeue (expensive): (k4m + k5) + 0 - k4m = k5

Thus, all of the operations run in O(1) amortized time.

Physicist's Method (ai = ti + Φ(di) - Φ(di-1))

We define the potential to be k4m where m is the length of the back.

  • enqueue: k1 + k4(m+1) - k4m = k1 + k4
  • peek: k2 + k4m - k4m = k2
  • dequeue (cheap): k3 + k4m - k4m = k3
  • dequeue (expensive): (k4m + k5) + k40 - k4m = k5

Thus, all of the operations run in O(1) amortized time.

Persistence

The amortized analysis, however, assumes that no Queue object is used as the input for more than one Queue operation. What if each version of the queue is persistent (as with a purely functional implementation) rather than ephemeral (destroyed with imperative, in-place updates)?

Consider the following example:

q_0      =  empty                  -- { front = [] , back = [] }
q_1      =  enqueue 1 q_0
q_2      =  enqueue 2 q_1
        ... 
q_n      =  enqueue n q_{n-1}
q_{n+1}  =  dequeue q_n            -- expensive, calls List.reverse
_        =  dequeue q_n            -- expensive, calls List.reverse

Using the FastQueue implementation, the operation dequeue q_n triggers a call to List.reverse, which runs in O(n) time; all other calls to dequeue run in O(1) time.

In the absence of persistence, the amortized analysis is able to reason that this expensive operation happens infrequently compared to the cheap ones and that any sequence of n FastQueue operations takes O(n) time overall.

But, the program is free to use the "old" version q_n of "the" queue elsewhere in the program; repeatedly dequeue-ing from it will break the reasoning that the expensive operation happens infrequently.

What to do? Be lazy and switch to an imperative language? Or be lazy?


Reading

Required

  • Okasaki, Chapter 5.1 and 5.6

Optional

  • Okasaki, Chapter 5.3 (BinomialHeap.insert runs in O(1) amortized time.)