Lazy Lists

A common data structure that incorporates laziness is a lazy list (a.k.a. stream). Having worked through our Lazy library and LazyNats in detail, our discussion of streams here will be brief, mainly focusing on picking the right representation. We'll start by working on the following function:

range : Int -> Int -> Lazy.Action ??? (LazyList Int)

First Attempt

One possibility for representing LazyLists is the following type.

type LazyList a
  = Nil
  | Cons (Lazy.Thunk a) (LazyList a)

This datatype describes lists that are not very lazy, however. Although the values inside the list are delayed, the entire list structure (i.e. the Cons cells) must be built immediately:

range : Int -> Int -> Lazy.Action Int (LazyList Int)
range i j =
  if i > j then
    Lazy.pure Nil
  else
    Lazy.lazy (\() -> Lazy.pure i) |> Lazy.andThen (\head ->
    range (i+1) j                  |> Lazy.andThen (\tail ->
      Lazy.pure (Cons head tail)
    ))

 

> NotSoLazyList.range 1 5 |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Lazy.lazy       : 2
Lazy.lazy       : 3
Lazy.lazy       : 4
Cons (Thunk { id = 0, thunk = <function> })
 (Cons (Thunk { id = 1, thunk = <function> })
   (Cons (Thunk { id = 2, thunk = <function> })
     (Cons (Thunk { id = 3, thunk = <function> })
       (Cons (Thunk { id = 4, thunk = <function> }) Nil))))
    : LazyList Int

Second Attempt

Another option is the following.

type LazyList a
  = Nil
  | Cons a (Lazy.Thunk (LazyList a))

This is pretty good, but notice that a non-Nil list must have its first value evaluated. Consider what the representation of a range of Ints looks like.

range : Int -> Int -> Lazy.Action (LazyList Int) (LazyList Int)
range i j =
  if i > j then
    Lazy.pure Nil
  else
    Lazy.map (Cons i) (Lazy.lazy (\() -> range (i+1) j))

 

> PrettyLazyList.range 1 5 |> run
Lazy.lazy       : 0
Cons 1 (Thunk { id = 0, thunk = <function> })
    : LazyList Int

Final Attempt — LazyList.elm

What we really want is for all elements in the list, including the first, to be delayed until needed. We can achieve this as follows.

type alias LazyList a
  = Lazy.Thunk (LazyListCell a)

type LazyListCell a
  = Nil
  | Cons a (LazyList a)

Thought Exercise: Why didn't we use a similar strategy in defining the the lazy Nats before?

Let's warm up with a few fun operations...

range    : Int -> Int -> Lazy.Action (LazyListCell Int) (LazyList Int)
toList   : LazyList a -> Lazy.Action (LazyListCell a) (List a)
infinite : a -> Lazy.Action (LazyListCell a) (LazyList a)

... and then implement some core operations from the book:

append   : LazyList a -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
take     : Int -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
drop     : Int -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
reverse  : LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)

range

The range function is incremental: each recursive call is wrapped in a Thunk. Notice that along the then branch, a trivial suspension is created: Lazy.lazy (\() -> Lazy.pure Nil).

range : Int -> Int -> Lazy.Action (LazyListCell Int) (LazyList Int)
range i j =
  Lazy.lazy (\() ->
    if i > j then
      Lazy.pure Nil
    else
      Lazy.map (Cons i) (range (i+1) j)
  )

 

> range 1 5 |> run
Lazy.lazy       : 0
Thunk { id = 0, thunk = <function> }
    : LazyList Int

toList

Converting a stream to a List is monolithic. Thus, we will avoid using helper functions like Lazy.andThen in favor of tail-recursive functions that mention the caches.

toList : LazyList a -> Lazy.Action (LazyListCell a) (List a)
toList =
  let
    foo acc thunk cache =
      let (cell, newCache) = Lazy.force thunk cache in
      case cell of
        Nil ->
          (acc, newCache)
        Cons a thunkRest ->
          foo (a::acc) thunkRest newCache
  in
  Lazy.map List.reverse << (foo [])

Now we can force the incremental range function to do its work:

> range 1 5 |> andThen toList |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Lazy.force MISS : 0
Lazy.lazy       : 2
Lazy.force MISS : 1
Lazy.lazy       : 3
Lazy.force MISS : 2
Lazy.lazy       : 4
Lazy.force MISS : 3
Lazy.lazy       : 5
Lazy.force MISS : 4
Lazy.force MISS : 5
[1,2,3,4,5] : List Int

> range 1 100000 |> andThen toList |> run
...
...,99998,99999,100000]
    : List Int

infinite

Infinite streams are defined incrementally.

infinite : a -> Lazy.Action (LazyListCell a) (LazyList a)
infinite a =
  Lazy.lazy (\() -> Lazy.map (Cons a) (infinite a))

Not surprisingly, we don't have enough memory to represent an an entire infinite stream:

> infinite 9 |> run
Lazy.lazy       : 0
Thunk { id = 0, thunk = <function> }
    : LazyList number

> infinite 9 |> andThen toList |> run
... FATAL ERROR: ...  JavaScript heap out of memory

take

The take function is incremental.

Take 1

take : Int -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
take k thunk =
  Lazy.force thunk |> Lazy.andThen (\cell ->
    case (k, cell) of
      (0, _) ->
        Lazy.lazy (\() -> Lazy.pure Nil)
      (_, Nil) ->
        Lazy.lazy (\() -> Lazy.pure Nil)
      (_, Cons a thunkRest) ->
        Lazy.lazy (\() -> Lazy.map (Cons a) (take (k-1) thunkRest))
  )

But there is still some unnecessary work; take forces the input list even if no elements are taken:

> infinite 9 |> andThen (take 0) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Lazy.force MISS : 0
Lazy.lazy       : 2
Thunk { id = 2, thunk = <function> }
    : LazyList number

Take 2

take : Int -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
take k thunk =
  if k <= 0 then
    Lazy.lazy (\() -> Lazy.pure Nil)
  else
    Lazy.force thunk |> Lazy.andThen (\cell ->
      case cell of
        Nil ->
          Lazy.lazy (\() -> Lazy.pure Nil)
        Cons a thunkRest ->
          Lazy.lazy (\() -> Lazy.map (Cons a) (take (k-1) thunkRest))
    )

This no longer forces the list when zero elements are taken...

> infinite 9 |> andThen (take 0) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Thunk { id = 1, thunk = <function> }
    : LazyList number

... but it does even before the first element is really needed:

> infinite 9 |> andThen (take 1) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Lazy.force MISS : 0
Lazy.lazy       : 2
Thunk { id = 2, thunk = <function> }
    : LazyList number

Final Take

take k thunk =
  Lazy.lazy (\() ->
    if k <= 0 then
      Lazy.pure Nil
    else
      Lazy.force thunk |> Lazy.andThen (\cell ->
        case cell of
          Nil ->
            Lazy.pure Nil
          Cons a thunkRest ->
            Lazy.map (Cons a) (take (k-1) thunkRest)
      )
  )

That's better:

> infinite 9 |> andThen (take 0) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Thunk { id = 1, thunk = <function> }
    : LazyList number

> infinite 9 |> andThen (take 1) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Thunk { id = 1, thunk = <function> }
    : LazyList number

> infinite 9 |> andThen (take 5) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Thunk { id = 1, thunk = <function> }
    : LazyList number

> infinite 9 |> andThen (take 5) |> andThen toList |> run
...
Lazy.lazy       : 11
Lazy.force MISS : 9
Lazy.force MISS : 11
[9,9,9,9,9] : List number

drop

Initially, drop doesn't have to force the list. But as soon as it does, it makes all the recursive calls right away. That is, it's monolithic. Thus, we implement a tail-recursive helper function with explicit caches, rather than using helper functions like Lazy.andThen.

drop : Int -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
drop =
  let
    foo k thunk cache =
      if k <= 0 then
        (thunk, cache)
      else
        let (cell, newCache) = Lazy.force thunk cache in
        case cell of
          Nil ->
            -- reusing thunk, since it produces Nil
            (thunk, newCache)
          Cons a thunkRest ->
            foo (k-1) thunkRest newCache
  in
  \k thunk ->
    Lazy.lazy (\() ->
      foo k thunk |> Lazy.andThen Lazy.force
    )

For example:

> range 1 10 |> andThen (drop 5) |> andThen toList |> run
...
[6,7,8,9,10] : List Int

> range 1 100000 |> andThen (drop 99990) |> andThen toList |> run
...
[99991,99992,99993,99994,99995,99996,99997,99998,99999,100000]
    : List Int

> infinite 9 |> andThen (drop 10) |> andThen (take 10) |> andThen toList |> run
[9,9,9,9,9,9,9,9,9,9]
    : List number

append

Combining two streams using append is incremental.

append : LazyList a -> LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
append xsThunk ysThunk =
  Lazy.lazy (\() ->
    Lazy.force xsThunk |> Lazy.andThen (\cell ->
      case cell of
        Nil ->
          Lazy.force ysThunk
        Cons a xsThunkRest ->
          Lazy.map (Cons a) (append xsThunkRest ysThunk)
    )
)

reverse

Reversing a stream delays forcing the input list...

reverse : LazyList a -> Lazy.Action (LazyListCell a) (LazyList a)
reverse =
  let
    foo : LazyListCell a -> LazyList a
       -> Lazy.Action (LazyListCell a) (LazyListCell a)
    foo accCell thunk cache =
      let (headCell, cache1) = Lazy.force thunk cache in
      case headCell of
        Nil ->
          (accCell, cache)
        Cons a thunkRest ->
          let
            (newAccTail, cache2) =
              Lazy.lazy (\() -> Lazy.pure accCell) cache1

            (_, cache3) =
              -- TODO: force trivial thunk...
              Lazy.force newAccTail cache2
          in
          foo (Cons a newAccTail) thunkRest cache3
  in
  \thunk ->
    Lazy.lazy (\() -> foo Nil thunk)

... but once it is forced, the recursion is monolithic:

> range 1 5 |> andThen reverse |> andThen (take 1) |> run
Lazy.lazy       : 0
Lazy.lazy       : 1
Lazy.lazy       : 2
Thunk { id = 2, thunk = <function> }
    : LazyList Int

> range 1 5 |> andThen reverse |> andThen (take 1) |> andThen toList |> run
...
Lazy.force MISS : 1
Lazy.lazy       : 13
Lazy.force MISS : 2
Lazy.force MISS : 13
[5] : List Int


Reading

Required

  • Okasaki, Chapter 4.2