Thursday, February 03, 2011

Lazy lists in Groovy

I like lazy evaluation, and it's one of the reasons I like Haskell language so much. Although from engineering perspective lazy evaluation is probably not the most needed feature, it's definitely very useful for solving some mathematical problems.

Most languages don't have lazy evaluation out of the box, but you can implement it using some other language features. This is an interesting task, and I use it as a code kata which I practice every time I learn a new strict language.

So, how to implement lazy lists in strict languages? Very simple, if the language has functional capabilities. Namely, you build lazy list recursively by wrapping strict list within a function. Here is, for example, the strict empty list in Groovy:

[]

If we wrap it with a closure, it becomes lazy empty list:

{-> [] }

If we need a list with one element, we prepend (or speaking Lisp terminology 'cons') an element to lazy empty list, and make the result lazy again:

{-> [ element, {-> [] } ] }

To add more elements we continue the same process until all elements are lazily consed. Here is, for example, a lazy list with three elements a, b and c:

{-> [a, {-> [b, {-> [ c, {-> [] } ] } ] } ] }

Now, when you have an idea how to build lazy lists, let's build them Groovy way. We start with creating a class:

class LazyList {
private Closure list

private LazyList(list) {
this.list = list
}
}

The variable list encapsulates the closure wrapper of the list. We just need to expose some methods that allow constructing lists using procedure described above:

    static LazyList nil() {
new LazyList( {-> []} )
}

LazyList cons(head) {
new LazyList( {-> [head, list]} )
}

Now we can construct lists by consing elements to empty list:

def lazylist = LazyList.nil().cons(4).cons(3).cons(2).cons(1)

To access elements of the list we implement two standard functions, car and cdr, that return head and tail of the list respectively.

    def car() {
def lst = list.call()
lst ? lst[0] : null
}

def cdr() {
def lst = list.call()
lst ? new LazyList(lst.tail()[0]) : nil()
}

Here is how you use these functions to get first and second elements of the list constructed above

assert lazylist.car() == 1
assert lazylist.cdr().car() == 2

In Lisp there are built-in functions for various car and cdr compositions. For example, the previous assertion would be equivalent to function cadr. Instead of implementing all possible permutations, let's use Groovy metaprogramming to achieve the same goal.

    def methodMissing(String name, args) {
def matcher = name =~ /c([ad])([ad]+)r/
if (matcher) {
matcher[0][2].reverse().toList().inject(this) {
del, index -> del."c${index}r"()
}."c${matcher[0][1]}r"()
} else {
throw new MissingMethodException(name, this.class, args)
}
}

It might look complicated, but in reality it's pretty simple if you are familiar with Groovy regex and functional programming. It's easier to explain by example. If we pass "caddr" as a value of name parameter, the method will create a chain on method calls .cdr().cdr().car() which will be applied to delegate of the operation which is our LazyList object.

With this method in place we can call car/cdr functions with arbitrary depth.

assert lazylist.caddr() == 3

If you create nested lazy lists, you can access any element of any nested list with this dynamic method.

def lmn = LazyList.nil().cons('N').cons('M').cons('L')
def almnz = LazyList.nil().cons('Z').cons(lmn).cons('A')
assert almnz.cadadr() == 'M'

With so many cons methods it's hard to see the structure of the list. Let's implement lazy method on ArrayList class that converts strict list to lazy. Again, we will use metaprogramming and functional techniques.

ArrayList.metaClass.lazy = {
-> delegate.reverse().inject(LazyList.nil()) {list, item -> list.cons(item)}
}

Now we can rewrite the previous example as follows

def lazyfied = ['A', ['L','M','N'].lazy(), 'Z'].lazy()
assert lazyfied.cadadr() == 'M'

What have we accomplished so far? We learned how to build lazy lists from scratch and from strict lists. We know how to add elements to lazy lists, and how to access them. The next step is to implement fold function. fold is the fundamental operation in functional languages, so our lazy lists must provide it.

    boolean isEmpty() {
list.call() == []
}

def fold(n, acc, f) {
n == 0 || isEmpty() ? acc : cdr().fold(n-1, f.call(acc, car()), f)
}

def foldAll(acc, f) {
isEmpty() ? acc : cdr().foldAll(f.call(acc, car()), f)
}

The only difference between this fold function and the standard one is the additional parameter n. We will need it later when we implement infinite lists. foldAll function to lazy lists is the same as standard fold to strict lists.

assert [1,2,3,4,5].lazy().foldAll(0){ acc, i -> acc + i } == 15
assert [1,2,3,4,5].lazy().fold(3, 1){ acc, i -> acc * i } == 6

First example calculates the sum of all elements of the list, second calculates the product of first three elements.

If you have fold functions you can easily implement take functions

    def take(n) {
fold(n, []) {acc, item -> acc << item}
}

def takeAll() {
foldAll([]) {acc, item -> acc << item}
}

def toList() {
takeAll()
}

take is an inverse operation to lazy

assert [1,2,3,4,5].lazy().takeAll() == [1,2,3,4,5]
assert [1,2,3,4,5].lazy().take(3) == [1,2,3]

Our next goal is map function on lazy lists. Ideally I want the implementation look like this

    def map(f) {
isEmpty() ? nil() : cdr().map(f).cons(f.call(car()))
}

For some reason it doesn't work lazy way in Groovy — it's still strictly evaluated. Therefore I have to implement it directly with closure syntax

    def map(f) {
isEmpty() ? nil() : new LazyList( {-> [f.call(car()), cdr().map(f).list]} )
}

Unlike fold, lazy map is identical to strict map

assert [1,2,3,4,5].lazy().map{ 2 * it }.take(3) == [2,4,6]

The following example shows one of the benefits of laziness

assert [1,2,3,0,6].lazy().map{ 6 / it }.take(3) == [6,3,2]

map didn't evaluate the entire list, hence there was no exception. If you evaluate expression for all elements, the exception will be thrown

try {
[1,2,3,0,6].lazy().map{ 6 / it }.takeAll()
}
catch (Exception e) {
assert e instanceof ArithmeticException
}

For strict lists this is a default behaviour of map function.

The last function I want to implement is filter

    def filter(p) {
isEmpty() ? nil() :
p.call(car()) ? new LazyList( {-> [car(), cdr().filter(p).list]} ) :
cdr().filter(p)
}

In the following example we find first two elements greater than 2

assert [1,2,3,4,5].lazy().filter{ 2 < it }.take(2) == [3,4]

With the help of car/cdr, fold, map and filter you can implement any other function on lazy lists yourself. Here is, for example, the implementation of zipWith function

    static def zipWith(alist, blist, f) {
alist.isEmpty() || blist.isEmpty() ? nil() :
new LazyList( {-> [
f.call(alist.car(), blist.car()),
zipWith(alist.cdr(), blist.cdr(), f).list
]} )
}

Now, after we implemented all lazy functions we need, let's define infinite lists

    private static sequence(int n) {
{-> [n, sequence(n+1)]}
}

static LazyList integers(int n) {
new LazyList(sequence(n))
}

static LazyList naturals() {
integers(1)
}

Infinite lists, from my point of view, is the most useful application of lazy lists

def naturals = LazyList.naturals()
assert naturals.take(3) == [1,2,3]

def evens = naturals.map { 2 * it }
assert evens.take(3) == [2,4,6]

def odds = naturals.filter { it % 2 == 1 }
assert odds.take(3) == [1,3,5]

assert naturals.cadddddddddr() == 10

def nonnegatives = naturals.cons(0)
assert nonnegatives.cadr() == 1

assert LazyList.zipWith(evens, odds){ x, y -> x * y }.take(4) == [2,12,30,56]

At this point you have all basic functionality implemented, and you should be able to extend this model to whatever you need in regards to lazy (infinite) lists. Happy lazy programming!

Resources and links

• Source code for this blog

• Lazy list implementation in Erlang

• Lazy list implementation in Lisp

No comments: