Arc-macros, baby. Arc-macros.

OK, this is not really about macros, it’s really a reader hack, but bear with me… :-)

Let’s implement Arc’s “negation” operator (~) in Chicken. It’s only a few lines of code.

(define (negate pred)
  (lambda (x)
    (not (pred x))))

(set-read-syntax! #\~
  (lambda (port)
    (let ((expr (read port)))
      (list 'negate expr))))

That’s all. Test test… (Code below uses SRFI-1 for filter.)

> (use srfi-1)
; loading library srfi-1 ...
> (filter ~odd? '(1 2 3 4 5 6 7 8))
(2 4 6 8)
> (filter ~(lambda (x) (> x 2)) '(0 1 2 3 4 5 6))
(0 1 2)

This probably falls in the category “don’t try this at home”. But if you’re coming from languages where this just isn’t possible, this kind of stuff is really cool.

Note #1: you can just use (negate odd?) and such, which is slightly longer but probably cleaner. And that kind of thing would work in Python as well. (Although function composition isn’t all that popular in Python-land.)

Note #2: This implementation just reads the expression that follows ~, then feeds it to negate. So using a lambda works as well (but kind of defeats the purpose, which is conciseness).

By the way, the reader extension I showed a few days ago (to add Awk-like $N behavior) works, but I found out that it’s not really how you’re supposed to write it. At the point something like $3 is read, we’re *reading* rather than *evaluating*, so it should really expand to something that returns the right result *when evaluated later*. So better code would be:

(set-read-syntax! #\$
  (lambda (port)
    (let* ((s (read-number port))
           (i (string->number s)))
      (list 'field i))))

In other words:

  • when the reader encounters $3, it expands it to the form (field 3) (but *does not* evaluate it at that point)
  • later on, we evaluate (field 3) and get the correct result, using whatever is in *fields* at the time of evaluation

So yeah, maybe this is like macros after all, but I don’t know the correct term. :-)

There’s another post like this coming up, using Arc as an excuse to tinker with the Scheme reader.

Update (2008-02-04): Here is similar code in Python, sort of. The additional “syntax” (which is actually, creating a class that defines “~”, and then wrapping functions in it) seems to be more trouble than it’s worth. If you do this kind of thing a lot in Python (probably not, but you never know), you might be better off using a simple negate function, e.g.

def negate(f):
    return lambda *args, **kwargs: not f(*args, **kwargs)

(I normally would not have used lambda, but after more than a month of Scheming, it’s hard not to. :-)

:: Comments (3)

Defining custom literals in Chicken Scheme

In a previous post, I briefly pondered what a Scheme-based Awk-like tool would look like. Awk has a concise syntax to access fields; e.g. $1 is the first field in a line, etc. A similar tool written in Scheme would benefit from having such syntax as well.

So, I wondered how much work it would be to add it to Chicken. Let’s say, something that maps $1 to (field 1) (assuming a function called field exists, of course). As it turns out, it’s not much work at all, not even for someone who has never hacked the Scheme reader (that would be me :-).

First of all, we need to look at the set-read-syntax function, which is helpfully built into Chicken. It is used like this:

(set-read-syntax! <character>
  (lambda (port)
    ...read characters...
    ...return custom value...

…where <character> is the first character of the new literal. The lambda that follows can then do custom reading from port, resulting in data that can be manipulated at will. In this case, I want the literal to start with $, then read digits, and stop as soon as a non-digit is encountered. So my code should look something like this:

(set-read-syntax! #\$
  (lambda (port)
    (let* ((s (read-number port))
           (i (string->number s)))
      (field i))))

Except that Chicken doesn’t have a read-number function. Fortunately, it’s not hard to write one. Here’s a version using read-token. (read-token reads a character at a time and tests it against a predicate, collecting characters that match the predicate, stopping as soon as one doesn’t match, and returning the collected characters as a string.)

(define (number-char? c)
  (member c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)))

(define (read-number port)
  (read-token number-char? port))

Now let’s test it with a dummy implementation of field.

(define *fields* '())

(define (field n)
  (cond
    ((< n 1) "")
    ((> n (length *fields*)) "")
    (else (list-ref *fields* (- n 1)))))

(set! *fields* (string-split "the quick brown fox jumps over the lazy dog"))

(printf "~a ~a ~a~n" $1 $9 $5)

…prints “the dog jumps”. :-)

(And yeah, my code isn’t perfect, but it’s just for demonstration purposes.)

Much like Ruby’s monkeypatching, defining custom literals is probably not something that should be used in libraries a lot, but it looks like it could be very useful in DSLs or tools like the one mentioned.

:: Comments (1)

Simple unit testing with Chicken Scheme

I’m currently translating a small toy project of mine to Chicken Scheme. Since the original (written in Python) has unit tests, it made sense to look for similar functionality for Chicken. As it turns out, there are several testing frameworks available. Since right now I’m still exploring, I settled for a relatively simple one, the test egg (by Alex Shinn).

Unlike Python’s unittest, the test library is not object-oriented at all. Rather, it defines a small number of useful functions and macros, which can be used at will (e.g. nested, inside lets and loops, etc).

The most important interface is (test <expected-value> <expression>), which, naturally, checks if expression evaluates to expected-value. For example, (test 4 (+ 2 2)) would be a valid use of test. (Note that switching the arguments does *not* work; (test (+ 2 2) 4) is illegal.)

Here’s some actual code:

(use test)
(load "tools")

(test-group "make-address"
  (test #xC000 (make-address #xC0 #x00))
  (test #x1234 (make-address #x12 #x34)))

(test-group "high"
  (test #x08 (high #x0801))
  (test #xC0 (high #xC000))
  (test #x12 (high #x1234)))

(test-group "low"
  (test #x01 (low #x0801))
  (test #x00 (low #xC000))
  (test #x34 (low #x1234)))

(test-group "signed->unsigned"
  (define s->u signed->unsigned)
  (test 0 (s->u 0))
  (test 20 (s->u 20))
  (test #xFF (s->u -1))
  (test #xF7 (s->u -9))
  (test #x80 (s->u -128))
  (test-error (s->u 200)))

(test-group "unsigned->signed"
  (define u->s unsigned->signed)
  (test 0 (u->s 0))
  (test 20 (u->s 20))
  (test -128 (u->s #x80))
  (test -1 (u->s #xFF))
  (test -9 (u->s #xF7))
  (test-error (u->s -1)))

The test-group form is a simple but effective way to group tests together; they show up in separate sections in test reports. It has its own lexical scope, so anything defined inside it is not visible outside of it.

When an expression is expected to produce an error, use test-error to catch this and test for it. To simply assert that an expression is non-false, use test-assert.

Of course, tests can be used inside loops, which makes it really easy to test each item in a list. Each of these tests will show up as a separate entry in the resulting test report. (Something which isn’t so easy to accomplish with Python’s unittest.) Some more actual code (which tests all items in a list, and defines a meaningful name for each test):

(test-group "*opcodes*"
  (define (test-opcode opcode)
    (define test-name (sprintf "test opcode: ~s" opcode))
    (test-assert test-name (member (opcode-type opcode) *opcode-types*)))
  (for-each test-opcode *opcodes*))

When running these scripts, test reports are produced that look like this:

-- testing make-address ------------------------------------------------------
(make-address 192 0) ................................................. [ PASS]
(make-address 18 52) ................................................. [ PASS]
2 tests completed in 0.002 seconds.
2 out of 2 (100%) tests passed.
-- done testing make-address -------------------------------------------------

-- testing high --------------------------------------------------------------
(high 2049) .......................................................... [ PASS]
(high 49152) ......................................................... [ PASS]
(high 4660) .......................................................... [ PASS]
3 tests completed in 0 seconds.
3 out of 3 (100%) tests passed.
-- done testing high ---------------------------------------------------------

…etc.

The test egg works for my current purposes; writing tests is easy, and so is grouping them, or running several files of tests. (Just create a file test-all.scm that loads all the other test files.) However, in the future I might want to look for a testing framework that is a little more configurable. For example, as far as I can tell, it’s impossible to make small changes to the test report output, without rewriting the whole function that handles it (and registering it as the current-test-group-reporter). But I’ll stick with it for now — it’s a great way to do real testing (with surprising flexibility) with little work.

Update (2008-01-24): I found it somewhat annoying that, if you have a lot of tests in separate groups, you have to scroll up in the terminal window to see if any failed, because totals are only displayed for each group, but not for all groups as a whole. However, this is easily fixed with a small shell script:

csi -s test-all.scm | grep FAIL

shows quickly if there were any failures. It’s not perfect (e.g. it doesn’t show which groups the failed tests were in), but it helps.

:: Comments (3)

Scripting with Chicken

Today a cool script was posted to the chicken-users mailing list. Not only does it illustrate how to make an egg, it also demonstrates how to use Chicken Scheme as a scripting language.

The eggify function is somewhat long, but still relatively easy to read; the second let* part is long, but not complicated, as it can be read line by line. It almost reads like line-for-line statements, e.g.

# pseudo-translation to Python
metafile = metafiles[0]
metadata = handle_exceptions(...)
egg(metadata['egg'][0])
...etc...

And the actual body of it is just three “statements” (yeah, yeah, I know Scheme doesn’t really have statements), dealing with the file system and issuing commands (like tar).

This is of course nothing special for languages that are often regarded as “scripting languages” — like Perl, Python and Ruby — but it’s unusual to see it in Scheme. I’m delighted that it’s possible. This is one of the reasons I like Chicken — it tends to be more “real world” and “pragmatic” than some other implementations.

:: Comments off

Cool listcomp, revisited

About two weeks ago I looked at a list comprehension in Haskell and Python. Since then I have also looked at eager comprehensions in Scheme (using SRFI-42). So, I just thought I’d be interesting to write that same listcomp in Scheme. Here it is…

(use syntax-case)
(use srfi-42)

(define (f n)
  (list-ec (: x 1 (+ n 1))
           (: y (+ x 1) (+ n 1))
           (: z (+ y 1) (+ n 1))
           (if (= (+ (* x x) (* y y))
                  (* z z)))
           (list x y z)))

(print (f 20))

From a readability point of view, this code might benefit from something like

  (let ((m (+ n 1)))
      (list-ec (: x 1 m)
               (: y (+ x 1) m)
               (: z (+ y 1) m)
               ...

as the ranges used by eager comprehensions don’t include the upper bound, much like Python’s range function. (I.e. (: x 10) includes the numbers 0 through 9, but not 10.)

:: Comments off

Python vs Scheme: Simple list comprehensions

Python has a nifty feature called list comprehensions. Originally borrowed from Haskell, this is a (mostly) compact way to generate lists, without having to resort to the map and filter functions (which are often considered less clear and tend to rely on Python’s underpowered lambda).

Scheme, being a functional language, has an entirely different attitude toward map, filter and friends. Their use is encouraged, rather than being downplayed. In spite of that, there are situations where list comprehensions (or some form thereof) would be useful. So, SRFI-42 introduces “eager comprehensions”.

Rather than providing one construct, SRFI-42 has a number of forms, some of which seem to exist for efficiency reasons. For now, I’m just going to look at list-ec, which (in its most basic form) is a close cousin of Python’s list comprehensions.

Let’s run a simple example in Chicken:

> (use syntax-case)
; loading /usr/local/lib/chicken/3/syntax-case.so ...
; loading /usr/local/lib/chicken/3/syntax-case-chicken-macros.scm ...
> (use srfi-42)
; loading /usr/local/lib/chicken/3/srfi-42.scm ...
; loading /usr/local/lib/chicken/3/srfi-42-support.so ...

> (list-ec (: i 5) (* i i))
(0 1 4 9 16)

(Note: srfi-42 is an egg that needs to be installed first. It depends on another egg, syntax-case. Make sure these are installed, and imported in the correct order, or the examples won’t work!)

In Python, this would look like:

>>> [i*i for i in range(5)]
[0, 1, 4, 9, 16]

In this case, we could easily write this with a map… if Scheme had something similar to Python’s range function, which it doesn’t. In fact, the lack of such a construct seems to have been a major motivation for writing SRFI-42. The author states:

“The origin of this SRFI is my frustration that there is no simple [form] for the list of integers from 0 to n-1. With this SRFI it is (list-ec (: i n) i).”

Moreover, much like in Python, eager comprehensions are capable of more powerful expressions, that are not so easily written using map. Like this one, which isn’t terribly complex:

> (list-ec (: n 1 4) (: i n) (list n i))
((1 0) (2 0) (2 1) (3 0) (3 1) (3 2))

# Python:
>>> [(n,i) for n in range(1, 4) for i in range(n)]
[(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]

Much, much more is possible using SRFI-42, most of which I don’t understand yet :-), so for now I’m sticking to these simple examples. There will probably be a part II to this post, at some point. However, as a teaser, here’s a way to write Python’s enumerate (that works on strings and lists and probably some other types):

> (define (enumerate seq)
>   (list-ec (: x (index i) seq) (list i x)))

> (enumerate "hello")
((0 #\h) (1 #\e) (2 #\l) (3 #\l) (4 #\o))
> (enumerate '(guido larry matz))
((0 guido) (1 larry) (2 matz))

:: Comments (1)

Eggs, eggs, eggs

Eggs are official Chicken extensions. As you can see in the Eggs Unlimited repository, there are lots of them, for all kinds of purposes: extensions to the language, object systems, testing frameworks, graphics, web, XML, and much more. (As such, the repository is not unlike Perl’s CPAN, or RubyGems, or Python’s Cheese Shop.)

Eggs are Chicken-specific; while some other Schemes have their own packages, they are not compatible with eggs.

These extensions are (usually) easily installed, using the chicken-setup script. You can find the full documentation here, but the short story is as follows.

Look up the egg you want in Eggs Unlimited. Let’s say you don’t like the fact that Chicken doesn’t support the full numerical tower out of the box. So you want to install the numbers egg. Doing so is as easy as

$ chicken-setup numbers

or, if you need admin privileges, maybe

$ sudo chicken-setup numbers

(which is what I use on my Mac).

This will look for the numbers egg, download it, build it, and install it. Note that, like most package managers, chicken-setup will figure out any dependent eggs, and (if necessary) download and install them as well.

Some eggs require external libraries; for example, numbers wants to be compiled with -lgmp. Such libraries need to be installed first (e.g. via MacPorts on OS X).

In order to see which eggs have been installed, use the -l switch:

$ chicken-setup -l

I find the output a bit hard to read, because of the distance between the package name and the version. This one-liner might help:

$ chicken-setup -l | awk '{ print $1, $3 }'

This produces output that looks like this:

args 1.2
chicken-wrap 1.91
easyffi 1.91
environments 1.5
format 3.1.3
...etc...

To see detailed information about a specific egg, use the -l switch with a name, like:

$ chicken-setup -l numbers

…which will give you information about the files in the egg, the names that it exports, the release date, the version, dependencies, etc.

chicken-setup supports many more options (use the –help switch to see them), but these are among the most useful.

By the way, there’s a caveat that might not be a problem for most users unless they’re running on Windows without Cygwin:

“The mechanism for loading compiled extensions is based on dynamically loadable code and as such is only available on systems on which loading compiled code at runtime is supported. Currently these are most UNIX-compatible platforms that provide the libdl functionality like Linux, Solaris, BSD, Mac OS X and Windows using Cygwin.”

:: Comments off

Python vs Scheme: function parameters (part II)

In part I, I looked at the handling of function arguments as specified in R5RS. Chicken adds three more ways to handle special arguments, using the keywords #!optional, #!rest and #!key. (Here’s the relevant page in the Chicken documentation.)

The way I understand it, #!rest is just another way to collect optional arguments in a list. For our purposes, the following code is equivalent to (f a b . args):

> (define (r a b #!rest args)
     (list a b args))
> (r 1 2)
(1 2 ())
> (r 1 2 3)
(1 2 (3))

Optional arguments can also be specified separately (rather than collecting them all in a list). For this, use the #!optional keyword, followed by a list of names. Arguments passed to the function are associated with these names based on position. If not specified, the name defaults to #f.

> (define (o a #!optional b c)
     (list a b c))
> (o 1)
(1 #f #f)
> (o 2 'fred)
(2 fred #f)

;; roughly equivalent to Python:
;; def o(a, b=None, c=None): ...

We can also specify defaults:

> (define (p #!optional (a 42) (b 'cookie))
     (list a b))
> (p)
(42 cookie)
> (p 103)
(103 cookie)

;; roughly equivalent to Python:
;; def p(a=42, b="cookie"): ...

Again, values are associated with names based on position.

It would be useful if we could specify a parameter’s name. What if, in the above example, we want to override b but not a? In that case, we need to use #!key. This works much like #!optional, except that we can specify names (using #:name syntax). In fact, we *must* specify names, because positional arguments are ignored (unless we also specify #!optional or #!rest).

> (define (k #!key (a 42) (b 'cookie))
        (list a b))
> (k)
(42 cookie)
> (k 1)
(42 cookie)  ;; neither a nor b is 1
> (k #:b 'possum)
(42 possum)
> (k #:b 'possum #:a 99) ; order of args doesn't matter
(99 possum)
> (k b: 'possum) ; this is also allowed
(42 possum)

So, in order to set a value for a, we must use #a: value in the function call. value: is also allowed.

Note if a function specifies both #!key and #!rest, keyword arguments passed to the function will show up in the rest argument as well, no matter whether there’s a matching name in #!key or not.

> (define (z #!rest rest #!key (a 42) (b 'cookie))
     (list a b rest))
> (z)
(42 cookie ())
> (z 1 2)
(42 cookie (1 2))
> (z #:b 'soup 52)
(42 soup (b: soup 52))
> (z #:a 10 #:q 129)
(10 cookie (a: 10 q: 129))

In fact, using just #!rest, we can emulate Python’s **kwargs construct (sort of):

> (define (y #!rest r) r)
> (y #:a 3)
(a: 3)
> (y 1 2 3 #:foo 'bar)
(1 2 3 foo: bar)

(“Parsing” this list of rest args requires a bit of special code to collect the pairs; at this point, I’m not sure if Chicken provides such a function out of the box.)

By the way, we can pass the resulting list to apply without problems:

> (apply y '(1 2 3))
(1 2 3)
> (apply y '(1 2 3 foo: bar))
(1 2 3 foo: bar)
> (apply y '(1 2 3 #:foo bar))
(1 2 3 foo: bar)

What I like about the Chicken construct is that it doesn’t mix up positional and keyword arguments (unlike Python, although there’s a PEP to fix this in Python 3000).

Anyway, while all this is powerful, it’s probably generally a good idea not to make argument lists too complicated. (The same is true in Python, by the way.)

:: Comments off

Compiling Chicken code (part II)

One of the cool features of Chicken is that you may choose to compile parts of your Scheme program to C, in order to get better performance.

Let’s write a (naive) implementation of the factorial function. Store it in fac.scm:

;; fac.scm

(define (fac n)
(if (<= n 1)
1
(* n (fac (- n 1)))))

Now, write a second file main.scm that uses our fac function. Normally this would be as simple as

(load "fac")

(print (fac 30))

But I’d like to include some simple benchmark code, to see how well fac performs, interpreted vs compiled. I’m using the following code to call (fac 30) 100,000 times, and time the results.

;; main.scm

(load "fac")

(define *max-iterations* 100000)

(define t1 (current-milliseconds))
(let loop ((i 0))
(unless (= i *max-iterations*)
(fac 30)
(loop (+ i 1))))
(define t2 (current-milliseconds))

(define seconds (/ (- t2 t1) 1000))
(printf "~a iterations in ~a seconds.~n" *max-iterations* seconds)

Interpreted, it takes a little more than two seconds on my machine:

$ csi -s main.scm
100000 iterations in 2.191 seconds.

We can do better than that. Let’s compile fac.scm and see if it makes a difference.

$ csc -s fac.scm  # produces fac.so
$ csi -s main.scm
100000 iterations in 0.326 seconds.

That’s right — we still run main.scm interpreted, but it now uses the *compiled* fac.scm, which makes execution much faster. ^_^

We could do a lot more with this, e.g. compile main.scm to an executable, etc, but I just wanted to demonstrate how easy it is to precompile libraries, with no change in code. (At least, not for simple cases like this… but that’s a different story.)

:: Comments (2)

Compiling Chicken code (part I)

(This post is just a quick introduction to using the Chicken compiler and interpreter.)

Consider this simple file:

;; hello.scm

(print "Hello, world!")

Save this as hello.scm.

Running this as a script in the interpreter is done as follows:

$ csi -s hello.scm

Note that if the -s is omitted, the script will still run, after which the interactive interpreter is started. This can be useful in some situations (e.g. to do post-execution inspection).

Alternatively, we may choose to compile the script to an executable:

$ csc hello.scm

(csc has a lot of options, but the simplest case does what most people would expect, i.e. it compiles an the executable named hello.)

Lo and behold, it works:

$ ./hello
Hello, world!

On my system (using Chicken 2.732, Mac OS X 10.4, Intel Dual Core) this produces a 14K executable, which seems quite reasonable. However, the way I understand it, this file dynamically links to the Chicken libraries. To create a standalone executable, we must use the -static command line option, which produces a much bigger file:

$ csc -static hello.scm
$ ls -l hello
-rwxr-xr-x   1 zephyrfa  zephyrfa  1370732 Dec 30 13:54 hello*

By the way, I’m not entirely sure that this really produces an executable that is 100% standalone… I’ll have to do more testing.

There is much, much more to the Chicken compiler. Expect to see more about it in future posts. (Like the next one… :-)

:: Comments off

« Previous Page« Previous entries « Previous Page · Next Page » Next entries »Next Page »