For simplicity, we'll continue to work with heaps of Int
s. Our implementation (BinomialHeaps.elm
) exports the same type signatures as the previous implementations of min-heaps.
type alias Rank = Int
type Tree = Node Rank Int (List Tree)
rank (Node r _ _) = r
root (Node _ x _) = x
Binomial trees of rank r
are defined inductively as follows.
Node 0 n []
is a binomial tree of rank 0
.Node r n ts
is a binomial tree of rank r > 0
if ts
is a list of r
binomial trees with rank r-1
through 0
, respectively.A binomial tree of rank r
has 2r
nodes.
A binomial tree of rank r + 1
is formed by linking together two trees of rank r
, making one the leftmost child of the other.
The link
function below links two binomial trees, choosing to keep the smaller of the two elements at the root. Therefore, if t1
and t2
are both heap-ordered, then so is the result.
link : Tree -> Tree -> Tree
link t1 t2 =
let (Node r x1 c1) = t1
(Node _ x2 c2) = t2
in
if x1 <= x2
then Node (1+r) x1 (t2::c1)
else Node (1+r) x2 (t1::c2)
type alias InternalHeap = List Tree
type Heap = WrapHeap InternalHeap
A binomial heap is list of heap-ordered binomial trees, kept in strictly-increasing order of rank. A binomial heap containing n elements is represented using at most O(log n) binomial trees, analogous to the binary representation of n.
empty : Heap
empty = WrapHeap []
isEmpty : Heap -> Bool
isEmpty h = h == empty
The findMin
function searches for the smallest root among all of the binomial trees, taking O(log n) time.
findMin (WrapHeap ts) =
case List.map root ts of
[] -> Nothing
n::ns -> Just (List.foldl min n ns)
See Homework 3 for a way to implement findMin
so that it runs in O(1) time, as for other heap implementations.
Inserting into a binomial heap requires pairwise link
ing of Trees
with equal rank. Think "sum" and "carry" bits as in arithmetic addition. This is analogous to arithmetic addition.
insert : Int -> Heap -> Heap
insert x (WrapHeap ts) = WrapHeap (insertTree (Node 0 x []) ts)
insertTree : Tree -> InternalHeap -> InternalHeap
insertTree t ts = case ts of
[] -> [t]
t'::ts' ->
if rank t == rank t' then insertTree (link t t') ts'
else if rank t < rank t' then t :: ts
else Debug.crash "insertTree: impossible"
There are O(m) recursive calls to insertTree
(where m is the length of ts
), each of which performs O(1) work (to link
two trees). Thus, insert
runs in O(m) = O(log n) time.
merge : Heap -> Heap -> Heap
merge (WrapHeap ts1) (WrapHeap ts2) = WrapHeap (merge_ ts1 ts2)
merge_ : InternalHeap -> InternalHeap -> InternalHeap
merge_ ts1 ts2 = case (ts1, ts2) of
([], _) -> ts2
(_, []) -> ts1
(t1::ts1', t2::ts2') ->
if rank t1 < rank t2 then t1 :: merge_ ts1' ts2
else if rank t2 < rank t1 then t2 :: merge_ ts2' ts1
else insertTree (link t1 t2) (merge_ ts1' ts2')
To analyze the running time of merge_
, let m be the total number of trees in both ts1
and ts2
. The first two cases run in O(1) time. Each recursive call to merge_
decreases m by one (in the third and fourth cases) or two (in the fifth case). The cons operations in the third and fourth cases require O(1) work. In the fifth case, link
requires O(1) time, insertTree
requires O(m) time, and the recursive call to merge_
requires T(m-2) time. There are O(m) recursive calls, each of which requires at most O(m) time. The result is a O(m2) running time, which is O(log2(n)) where n is the total number of elements described in the two heaps being merged.
A more subtle analysis, however, can be used to argue that the implementation of merge_
runs in O(log n) time. The argument requires a more careful accounting of how many times link
is called (which is the crux of both insertTree
and merge_
) based on the analogy between merging lists and adding two numbers in binary representation.
For our purposes, we will consider an alternative, one-pass definition (also drawn from this post) that is slightly easier to analyze. (UPDATE 2/9: New r == r1 == r2
case for merge_wc
.)
merge' : InternalHeap -> InternalHeap -> InternalHeap
merge' ts1 ts2 = case (ts1, ts2) of
([], _) -> ts2
(_, []) -> ts1
(t1::ts1', t2::ts2') ->
if rank t1 < rank t2 then t1 :: merge' ts1' ts2
else if rank t2 < rank t1 then t2 :: merge' ts1 ts2'
else merge_wc (link t1 t2) ts1' ts2'
merge_wc : Tree -> InternalHeap -> InternalHeap -> InternalHeap
merge_wc t ts1 ts2 = case (ts1, ts2) of
([], _) -> insertTree t ts2
(_, []) -> insertTree t ts1
(t1::ts1', t2::ts2') ->
let (r,r1,r2) = (rank t, rank t1, rank t2) in
if r < r1 && r < r2 then t :: merge' ts1 ts2
else if r < r1 && r == r2 then merge_wc (link t t2) ts1 ts2'
else if r == r1 && r < r2 then merge_wc (link t t1) ts1' ts2
else if r == r1 && r == r2 then t :: merge_wc (link t1 t2) ts1' ts2'
else Debug.crash "merge_wc: impossible"
Let T(m) and S(m) be the running times of merge'
and merge_wc
, respectively, where m is an upper bound on the number of trees in both input lists combined. Consider each of the five cases of merge'
:
Consider each of the six cases of merge_wc
:
There are at most O(m) mutually recursive calls between the two functions. The last call to merge_wc
may take O(m) time, but all other calls take O(1) time. Thus, the worst-case running time for each of these two functions is O(m)+O(m) = O(m) time. That is, O(log n) time.
removeMinTree : InternalHeap -> (Tree, InternalHeap)
removeMinTree ts = case ts of
[t] -> (t, [])
t::ts' ->
let (t',ts'') = removeMinTree ts' in
if root t < root t'
then (t, ts')
else (t', t::ts'')
deleteMin : Heap -> Maybe Heap
deleteMin (WrapHeap ts) = case ts of
[] -> Nothing
_ -> let (Node _ x ts1, ts2) = removeMinTree ts in
Just (WrapHeap (merge_ (List.reverse ts1) ts2))
We can reuse the removeMinTree
helper function to reimplement findMin
.
findMin (WrapHeap ts) =
case ts of
[] -> Nothing
_ -> Just (root (fst (removeMinTree ts)))
Let's revisit the merge
function above and think a bit more carefully about why the output heap satisfies the intended invariants.
First, assume we have implemented a couple of predicates that identify valid binomial trees and binomial heaps, respectively:
binTree : Tree -> Bool
binHeap : Internal Heap -> Bool
Next, we can define the following function to compute the rank of the first binomial tree in a binomial heap, if any:
rankFst : InternalHeap -> Int
rankFst ts =
case ts of
[] -> Random.maxInt -- pretend this is "infinity"
t::_ -> rank t
Now let's reason about the functions on which merge
depends, namely, link
and insertTree
.
In comments, we'll write some pre-conditions on arguments that we expect all callers to satisfy and some post-conditions on the return value that our implementation intends to satisfy.
link : Tree -> Tree -> Tree
-- link : (t1:Tree) -> (t2:Tree) -> (ret:Tree)
--
-- @pre: binTree t1 && binTree t2 && rank t1 == rank t2
-- @post: binTree ret && rank ret == 1 + rank t1
A first attempt for insertTree
:
insertTree : Tree -> InternalHeap -> InternalHeap
-- insertTree : (t:Tree) -> (ts:InternalHeap) -> (ret:InternalHeap)
--
-- @pre: binTree t && binHeap ts && rank t <= rankFst ts
-- @post: binHeap ret
And a first attempt for merge_
:
merge_ : InternalHeap -> InternalHeap -> InternalHeap
-- merge_ : (ts1:InternalHeap) -> (ts2:InternalHeap) -> (ret:InternalHeap)
--
-- @pre: binHeap ts1 && binHeap ts2
-- @post: binHeap ret
In the case where t1
is cons-ed onto the recursive call to merge_
, the binHeap ret
predicate follows from (i) the post-condition established by the recursive call, (ii) the conditional check, and (iii) from the relationship between t1
and ts1'
established by the assumption binHeap ts1
. The case where t2
is cons-ed onto the recursive call is similar.
Now consider the case when insertTree
is called with the return value from the recursive call to merge_
. The post-condition of merge_
above is not sufficient to satisfy the pre-conditions required by insertTree
. So, we need to establish a stronger post-condition:
merge_ : InternalHeap -> InternalHeap -> InternalHeap
-- merge_ : (ts1:InternalHeap) -> (ts2:InternalHeap) -> (ret:InternalHeap)
--
-- @pre: binHeap ts1 && binHeap ts2
-- @post: binHeap ret &&
-- rankFst ret >= min (rankFst ts1) (rankFst ts2) (added)
This will satisfy the pre-conditions on insertTree
, but now the post-condition of insertTree
is not sufficient to satisfy the new, stronger post-condition of merge_
. So, we need to strengthen the post-condition of insertTree
:
insertTree : Tree -> InternalHeap -> InternalHeap
-- insertTree : (t:Tree) -> (ts:InternalHeap) -> (ret:InternalHeap)
--
-- @pre: binTree t && binHeap ts && rank t <= rankFst ts
-- @post: binHeap ret &&
-- rankFst ret >= rank t (added)
This is the kind of informal reasoning we need to do on paper to argue why merge
does indeed produce a valid binomial heap.
To gain more confidence, we could add defensive, dynamic checks at run-time to make sure that we haven't missed something. Or we could use powerful theorem provers to make certain that we've written the code correctly. Both of these are topics for another day.