Lazy Lists

(Video part one and part two )

A common data structure that incorporates laziness is a lazy list (a.k.a. stream). (In the absence of lazy mechanisms, we will continue to assume that our Thunks are cached and evaluated at most once.) Our discussion of streams here will be brief, mainly focusing on picking the right representation. We'll start by working on the following function:

range : Int -> Int -> LazyList Int

First Attempt

One possibility for representing LazyLists is the following type.

type LazyList a
  = Nil
  | Cons (Thunk a) (LazyList a)

This datatype describes lists that are not very lazy, however. Although the values inside the list are delayed, the entire list structure (i.e. the Cons cells) must be built immediately:

range : Int -> Int -> LazyList Int
range i j =
  if i > j then
    Nil
  else
    Cons (\() -> i) (range (i+1) j)

 

> NotSoLazyList.range 1 5
Cons <function>
  (Cons <function>
    (Cons <function>
      (Cons <function>
        (Cons <function> Nil))))
    : NotSoLazyList.LazyList Int

Second Attempt

Another option is the following.

type LazyList a
  = Nil
  | Cons a (Thunk (LazyList a))

This is pretty good, but notice that a non-Nil list must have its first value evaluated. Consider what the representation of a range of Ints looks like.

range : Int -> Int -> LazyList Int
range i j =
  if i > j then
    Nil
  else
    Cons i (\() -> range (i+1) j)

 

> PrettyLazyList.range 1 5
Cons 1 <function> : PrettyLazyList.LazyList Int

Final Attempt — LazyList.elm

What we really want is for all elements in the list, including the first, to be delayed until needed. We can achieve this as follows.

type alias LazyList a
  = Thunk (LazyListCell a)

type LazyListCell a
  = Nil
  | Cons a (LazyList a)

Thought Exercise: Why didn't we use a similar strategy in defining the the lazy Nats before?

Let's warm up with a few fun operations...

range    : Int -> Int -> LazyList Int
toList   : LazyList a -> List a
infinite : a -> LazyList a

... and then implement some core operations from the book:

append   : LazyList a -> LazyList a -> LazyList a
take     : Int -> LazyList a -> LazyList a
drop     : Int -> LazyList a -> LazyList a
reverse  : LazyList a -> LazyList a

range

The range function is incremental: each recursive call is wrapped in a Thunk. Notice that along the then branch, a trivial suspension is created: \() -> Nil.

range : Int -> Int -> Lazy.Action (LazyListCell Int) (LazyList Int)
range i j =
  if i > j then
    \() -> Nil
  else
    \() -> Cons i (range (i+1) j)

 

> LazyList.range 1 5
<function> : LazyList.LazyList Int

toList

Converting a stream to a List is monolithic. Thus, we use tail-recursion to avoid busting the stack.

toList : LazyList a -> List a
toList =
  let
    foo acc thunk =
      case force thunk of
        Nil ->
          acc
        Cons a thunkRest ->
          foo (a::acc) thunkRest
  in
    List.reverse << foo []

Now we can force the incremental range function to do its work:

> LazyList.range 1 5 |> toList
[1,2,3,4,5] : List Int

> range 1 100000 |> toList
...,99998,99999,100000]
    : List Int

infinite

Infinite streams are defined incrementally.

infinite : a -> LazyList a
infinite a =
  \() -> Cons a (infinite a)

Not surprisingly, we don't have enough memory to represent an an entire infinite stream:

> infinite 9 |> run
Lazy.lazy       : 0
<function> : LazyList number

> infinite 9 |> toList
... FATAL ERROR: ...  JavaScript heap out of memory ...

take

The take function is incremental.

Take 1

take : Int -> LazyList a -> LazyList a
take k thunk =
  case (k, force thunk) of
    (0, _) ->
      \() -> Nil
    (_, Nil) ->
      \() -> Nil
    (_, Cons a thunkRest) ->
      \() -> Cons a (take (k-1) thunkRest)

But there is still some unnecessary work; take forces the input list even if no elements are taken.

Take 2

take : Int -> LazyList a -> LazyList a
take k thunk =
  if k <= 0 then
    \() -> Nil
  else
    case force thunk of
      Nil ->
        \() -> Nil
      Cons a thunkRest ->
        \() -> Cons a (take (k-1) thunkRest)

This no longer forces the list when zero elements are taken, but it does even before the first element is really needed.

Final Take

take : Int -> LazyList a -> LazyList a
take k thunk =
  \() ->
    if k <= 0 then
      Nil
    else
      case force thunk of
        Nil ->
          Nil
        Cons a thunkRest ->
          Cons a (take (k-1) thunkRest)

 

> infinite 9 |> take 5
<function> : LazyList number

> infinite 9 |> take 5 |> toList
[9,9,9,9,9] : List number

drop

Initially, drop doesn't have to force the list. But as soon as it does, it makes all the recursive calls right away. That is, it's monolithic. Thus, we implement a tail-recursive helper function.

drop : Int -> LazyList a -> LazyList a
drop =
  let
    foo k thunk =
      if k <= 0 then
        thunk
      else
        case force thunk of
          Nil ->
            thunk -- reusing thunk, since it produces Nil
          Cons a thunkRest ->
            foo (k-1) thunkRest
  in
  \k thunk ->
    \() -> force (foo k thunk)

For example:

> range 1 10 |> drop 5 |> toList
[6,7,8,9,10] : List Int

> range 1 100000 |> drop 99990 |> toList
[99991,99992,99993,99994,99995,99996,99997,99998,99999,100000]
    : List Int

> infinite 9 |> drop 10 |> take 10 |> toList
[9,9,9,9,9,9,9,9,9,9]
    : List number

append

Combining two streams using append is incremental.

append : LazyList a -> LazyList a -> LazyList a
append xsThunk ysThunk =
  \() ->
    case force xsThunk of
      Nil ->
        force ysThunk
      Cons a xsThunkRest ->
        Cons a (append xsThunkRest ysThunk)

reverse

Reversing a stream is monolithic:

reverse : LazyList a -> LazyList a
reverse =
  let
    foo acc thunk =
      case force thunk of
        Nil ->
          acc
        Cons a thunkRest ->
          foo (\() -> Cons a acc) thunkRest
  in
  \thunk ->
    \() -> force (foo (\() -> Nil) thunk)

Notice that (\_ -> Cons a acc) above is another example of a trivial thunk. The values a and acc have already been evaluated, so forcing the Cons cell does not force any additional computations.

> range 1 0 |> reverse |> toList
[] : List Int

> range 1 10 |> reverse |> toList
... JavaScript heap out of memory ...

> range 1 0 |> reverse |> toList
... JavaScript heap out of memory ...

Hmm, I'm not sure why reverse isn't working. The analogous lazylist.ml implementation in OCaml works as intended:

% ocaml
# #use "lazylist.ml" ;;
# range 1 10 |> reverse ;;
- : int lazylist = Thunk <fun>
# range 1 10 |> reverse |> toList ;;
- : int list = [10; 9; 8; 7; 6; 5; 4; 3; 2; 1]
# range 1 10000000 |> reverse |> toList ;;
- : int list = [10000000; 9999999; ...


Reading

Required

  • Okasaki, Chapter 4.2

Optional