dollop grows continuations…

Re: dollop… Continuations now work… or at least a subset of them. This works, for example:

> (+ 1 (call/cc (lambda (k) (+ 3 (k 2)))))
=> 3

Continuations are hairy like Fuzzyman. ;-) So if you don’t understand the example above, don’t worry, they take a while to wrap your head around. Someday I’ll write a “continuations in 60 seconds” post (or maybe a bit more time will be needed :-); until then, this explanation might help.

:: Comments off

Heist

Yesterday I stumbled upon this: Heist, a Scheme interpreter in Ruby.

It looks pretty good; it’s certainly more advanced than my attempt at a Scheme interpreter in Python (which is, in turn, more advanced than Psyche, and will be released eventually, once I backport some of the dollop code to it). Apparently it supports (among other things) hygienic macros and continuations, features that haven’t yet made it into my interpreter.

On a side note, it’s interesting to see how different the Ruby code looks compared to Python code. Five years ago I would have said (and did say :-) that Python and Ruby are much more alike than unalike. They still are, but they surely encourage (and discourage) different programming practices and idioms.

:: Comments (2)

A dollop of Lisp

I’ve written a few (toy) Lisp interpreters over the years. Who hasn’t right? :-) At the core of such interpreters, there’s the eval-apply cycle. It basically works like this:

  • To evaluate an expression, call eval on it.
  • If the expression is atomic, this is just a matter of a simple name lookup, or the expression evaluates to itself (in the case of literals).
  • If the expression is composite (i.e. a list), it is considered to be a function call, where the first element of the list is a function, and the other elements are the arguments passed to that function. We evaluate all of these elements (again by calling eval on them), then call apply to do the actual function application.
  • If said function is built-in, we just call it, get a result back, and be done. If it’s a user-defined function, then we need to evaluate the expressions in that function’s body, using the procedure described here.

This is a simplified version — it doesn’t take into account special forms or macros, for example — but it does capture the essence of the eval-apply cycle.

Anyway, since expressions can be nested arbitrarily deep, it should be clear that eval and apply can and often do call themselves and each other.

Now, when writing a Scheme interpreter in a language like Python, it is tempting to use a straightforward translation of this principle. You write functions (or methods, if you wish) eval() and apply(), which may call themselves and each other. This works rather well — until you blow the stack. The implementation language’s stack, that is… because every call to eval or apply in the Scheme interpreter, also uses the implementation language’s call stack.

This is a problem, especially in Scheme, which relies on tail recursion to define “iterative” processes. (These still use recursion, but because of tail call optimization, they operate in constant stack space. See SICP 1.2.1.)

We can simulate tail recursion by using exceptions (as demonstrated by this Cookbook recipe, for example). This is the approach I followed in most of my projects. While it does work, sooner or later you’ll run into other problems, because the interpreter is still piggybacking on Python’s call stack, rather than having one of its own. So, implementing continuations, for example, becomes very difficult, as the notion of “the calling expression” is implicit.

So, I decided to use a different approach. And now we’re finally getting to the point of this blog post. :-)

This time I started with a call stack. The expression we want to evaluate is pushed onto it. For example,

(+ 1 a)

Then, I wrote a method run() which does one “evaluation step” using this call stack. If it’s done evaluating, it returns the result value, otherwise it returns None, as an indication that there’s still more work to do. (So, in order to evaluate an expression, you call it multiple times until you get a result back.)

How does such an evaluation step work? We take the topmost expression on the call stack, and start evaluating it. In this case, it’s a composite expression, so we just start at the beginning. We want to evaluate the symbol +. To do so, we push + onto the stack, and put a placeholder in its original position, indicated by “$$”.

;; call stack now looks like:
($$ 1 a) +

We’re now done with this step. In the next step, we again look at the topmost expression on the stack. It’s a symbol; we look it up, get a function back (represented by <lambda:+>, and stick it back into the original expression. We then move on to the next element of the list, which is 1, and push it.

(<lambda:+> $$ a) 1

Another step, and we get:

(<lambda:+> 1 $$) a

Let’s say the variable a has been defined and its value is 4. We then get:

(<lambda:+> 1 4)

Now we’re done evaluating the elements of this expression, and we can do the actual function application. Finally, we get the result 5.

This gets more complex when we use nested expressions, but the principle is the same.

(+ (* a (- c d)))
($$ (* a (- c d))) +
(<lambda:+> $$) (* a (- c d))
(<lambda:+> $$) ($$ a (- c d)) *
(<lambda:+> $$) (<lambda:*> $$ (- c d)) a
(<lambda:+> $$) (<lambda:*> 4 $$) (- c d)
;; ...etc...

It becomes more interesting when we introduce special forms into the mix. But since we have an explicit call stack now, we can clearly see what is going on with it, and tail recursion starts making much more sense. For example, if our topmost expression is

(if condition (do-this) (do-that))

and during evaluation we determine that condition evaluates to true, we can at that point replace the whole expression by

(do-this)

rather than having to push (do-this) and then sticking the result back into the if expression. Ditto for the last expression inside a begin block, and so on.

The special forms require a bit of hackery, since we don’t evaluate all of their arguments… but there’s only a few of them.

The proof-of-concept implementation that uses this concept is called dollop and is available at github. (Requires Python 3.0.) The name is because it’s only a “dollop of Lisp” (or rather, Scheme); it only supports a few special forms (begin, define, if, lambda), and a few functions for example programs (+, -, *, =, list). It cuts corners in other ways as well, as my goal was to get a working proof-of-concept out, not to write a complete Scheme interpreter.

Anyway, comments, bug fixes, improvements, etc, welcome.

(I will also release two other Lisp-y interpreters, but those will get blog posts and project pages of their own…)

:: Comments (3)

^_^

As seen in the list of registered voters for the R*RS Steering Committee:

Name: Hal Abelson
[...]

The voter indicated that he would like to submit the first chapter of Structure and Interpretation of Computer Programs as his statement. The Steering Committee agrees that this does fulfill the requirement, given that the voter did write it, and it is more that 75 words long. Unfortunately we don’t have the space to reproduce it here, but it can be found online at http://mitpress.mit.edu/sicp/.

(For non-Schemers: To register as a voter, people must submit a statement indicating their interest in Scheme and the standardization process. Most people’s submissions look like, “I have been using Scheme for X years blah blah blah…” I guess if you wrote SICP, you can afford to do it differently. :-)

:: Comments off

Random thought

If I finish my Scheme interpreter (written in Python), and it comes out halfway decent, it would technically be possible to use Google App Engine with Scheme…

*ponders*

:: Comments off

Interpreter, redux

This post is a followup on:

Let’s take another look at the way the built-in words were defined, in the interpreter. This may look straightforward, but I used to do this differently, even in Python.

In the OCaml code (which will be made available soon), all built-in words are functions with the same signature. They all take two parameters, an environment (basically an object that holds the stack and other interpreter state) and a namespace. These functions then use the environment to pop elements off the stack, etc., resulting in code that looks like this:

(* dup  ( x -- x x ) *)
let dup env ns =
  let obj = Env.top env in
  Env.push env obj;;

(* swap  ( a b -- b a ) *)
let swap env ns =
  let a, b = Env.pop2 env in
  Env.push env a;
  Env.push env b;;

(* -  ( n1 n2 -- n3 )
*)
let minus env ns =
  let top_stack = Env.top_stack env in
  let x1 = Stack.pop top_stack in
  let x2 = Stack.pop top_stack in
  match x1.obj, x2.obj with
    LzInt i1, LzInt i2 ->
      let result = i2 - i1 in
      let newobj = Env.create_object env (LzInt result) in
      Stack.push top_stack newobj
  | _ -> raise Invalid_arguments "-: requires two integers"
;;

(Note: This was my first non-trivial project in OCaml, written about two years ago… I’m sure the code can be improved, but bear with me.)

For a long time I thought this was a fairly straightforward way to do it… until I saw the source code of this Joy interpreter (by John Cowan). Written in Scheme, it puts macros to good use in order to make word definitions simpler and clearer. The words mentioned above can be defined like this:

(joy-prim-list (dup x) (list x x))
(joy-prim-list (swap x y) (list y x))
(joy-prim (- i j) (- i j))

Definitions like (dup x) and (swap x y) tell us how many items are popped off the stack, *and* they get named automatically, to be used in the following expression.

Python doesn’t have macros, but it has other ways to do something similar. The tiny interpreter I wrote about yesterday uses a form of introspection to determine the number of arguments a function has. Using this, we can then write built-in words like this:

words = {
    '+': lambda x, y: [x+y],
    'dup': lambda x: [x, x],
    'drop': lambda x: [],
    'swap': lambda x, y: [y, x],
    'print': _print,
}

In other words, the function associated with +, takes two items off the stack, adds them, and puts them back. The function associated with dup takes one item, pushes back two, etc. This allows for definitions that are much more concise and straightforward… at least in these cases.

I wonder if the same can be done in statically typed languages like OCaml (or Haskell)? I don’t see an obvious way to do it, but I’d love to be enlightened. :-)

:: Comments (2)

Cutting

I did some long overdue Chicken hackery yesterday, and by accident I found out that Chicken’s (or rather, SRFI-26’s) cut/cute macros are not the same as Arc’s [ ] syntax after all.

Scheme/Arc aficionados already knew this, of course, but to me it was news since I’ve never really used cut much. Here’s a short explanation.

Scheme’s cut produces an anonymous function that takes as many arguments as there are <> symbols at the “top level”. E.g.

> (cut + 1 <>)
#<procedure (? g3)>
; has one argument

> (cut + 1 <> <>)
#<procedure (? g4 g5)>
; has two arguments

; etc...

However, any <> that is found in a nested expression, is not considered as a parameter. Therefore:

;this works...
> (map (cut + 1 <>) '(1 2 3))
(2 3 4)

; but this does not:
> (map (cut + 1 (* 2 <>)) '(1 2 3))
Error: bad argument count - received 1 but expected 0: #<procedure (?)>

By contrast, Arc’s [ ] syntax produces an anonymous function that expects one and only one argument, which is represented by a single underscore. As a result, it’s simultaneously more and less limited than cut:

arc> (map [+ 1 _] '(1 2 3))
(2 3 4)
arc> (map [+ 1 (* 2 _)] '(1 2 3))
(3 5 7)
; no problem

; but two parameters doesn't work:
arc> (map [cons _ _] '(1 2 3) '(4 5 6))
Error: "#<procedure>: expects 1 argument, given 2: 1 4"

Of course, for anonymous functions that are more complex than this, it’s probably better to just use lambda… :-/ Or a named function.

:: Comments (2)

Interlude: Dude, where’s my Chicken?

This started out as a Chicken blog, where I would report my adventures with Chicken Scheme. But these days it’s more like a Python blog again. (Which isn’t too surprising, given my experience with the language, and the fact that I use it for work and most of my personal projects.)

Heck, it has even been added to Planet Python again, which wouldn’t have been the case if I wrote about Scheme all the time.

But really, I want to learn more about Scheme as well, *and* be able to use it in a pragmatic way. I just need a bit of focus. Right now I’m wondering what to look at next. Maybe object systems?

:: Comments off

Python vs Scheme: using files as both modules and programs

Python has a useful idiom, that allows one to use the same file as both a module and a program. Consider this simple example:

# foo.py

def bar(x):
    print "bar says:", x

if __name__ == "__main__":

    bar(42)

The if __name__ == “__main__” clause is only executed if foo.py is run as the “main program”. In other words, we can do both

$ python foo.py
bar says: 42

and

$ python
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo
>>> foo.bar(33)
bar says: 33

Although admittedly a bit of a hack, this construct is well-known and often used.

Now, on to Chicken Scheme. Can we do the same? It required a bit of poking around in documentation and mailing list, but it appears the answer is yes.

;; foo.scm

(define (foo x)
  (print "foo says: " x))

(define (main args)
  (print "args: " args)
  (foo 42))

Now, we can run this as a script with csi -ss (which looks for a function called main and automagically calls it):

$ csi -ss foo.scm
args: ()
foo says: 42

Notice that the main function has an argument args, which contains the command line arguments passed to the program:

$ csi -ss foo.scm 1 2 3
args: (1 2 3)
foo says: 42

We can also import foo.scm from within an interactive session, in which cases main is not called:

$ csi

CHICKEN
Version 3.0.0 - macosx-unix-gnu-x86     [ manyargs dload ptables applyhook ]
(c)2000-2008 Felix L. Winkelmann        compiled 2008-03-05 on niflheim.local (Darwin)

; loading /Users/zephyrfalcon/.csirc ...
; loading /usr/local/lib/chicken/3/readline.so ...
#;1> (use foo)
; loading ./foo.scm ...
#;2> (foo 101)
foo says: 101

But, but! Isn’t Chicken primarily a compiler? Does the above work too when using csc rather than csi? Actually it does, but the invocation is different. I use the following:

$ csc foo.scm -postlude "(main (cdr (argv)))"

$ ./foo
args: ()
foo says: 42

$ ./foo 1 2 3
args: (1 2 3)
foo says: 42

The -postlude option can be used to specify code that runs when the executable is called. (Actually, the official explanation is: “Add EXPRESSIONS after all other toplevel expressions in the compiled file. This option may be given multiple times. Processing of this option takes place after processing of -epilogue.”)

I use (main (cdr (argv))) as the postlude expression, which seems to pass command line arguments the same way as csi -ss passes them to the main function (although there might be a catch that I’m not aware of). The cdr is necessary because the first item of the list returned by (argv) is the name of the calling program (e.g. “./foo”).

(If there’s a better way, please let me know, as my knowledge about the compiler is limited.)

Next up: parsing command line arguments in both Python and Scheme…

:: Comments (2)

Scheme crooks & nannies: the case form

You know R5RS’s case construct? It works much like case/switch statements in other languages, except it has more parentheses: ;-)

#;6> (define language 'chicken)
#;7> (case language
--->   ((c c++ java) "bleh")
--->   ((python perl ruby) "better")
--->   ((chicken scheme) "cool")
--->   (else "never heard of it"))
"cool"

Now, yesterday there was an interesting question in chicken-users. The following code works:

#;8> (define foo 'a)
#;9> (case foo
--->   ('a 3)
--->   (else 4))
3

Does that mean that Chicken’s case accepts qualifiers other than lists? Actually, no. Here’s a hint:

#;10> (define bar 'quote)
#;11> (case bar
---->   ('a 3)
---->   (else 4))
3

As it turns out, ‘a expands to (quote a), so it *is* a list, and therefore “correct”, in the sense that case treats it like a list containing the symbols quote and a. (Which is not what was intended, but it *is* valid.)

Too bad, the following does not work:

#;13> (define cool-names '(guido larry matz felix))
#;14> (case 'bill
---->   (cool-names "cool!")
---->   (else "uncool"))
Error: (map) during expansion of (case ...) -
 argument is not a proper list: cool-names

Inspecting a case construct with macroexpand-1 may give more insight in how case works exactly. I was going to leave this as an exercise to the reader, but thought better of it. :-) Here’s an example (formatting added by me):

#;1> (macroexpand-1 '(case a ((x y z) 5) (else 4)))
(let ((g2 a))
  (if (or (eqv? g2 (quote x))
          (eqv? g2 (quote y))
          (eqv? g2 (quote z)))
      (begin 5)
      (begin 4)))

:: Comments off

« Previous entries Next Page » Next Page »