Defining custom literals in Chicken Scheme
In a previous post, I briefly pondered what a Scheme-based Awk-like tool would look like. Awk has a concise syntax to access fields; e.g. $1 is the first field in a line, etc. A similar tool written in Scheme would benefit from having such syntax as well.
So, I wondered how much work it would be to add it to Chicken. Let's say, something that maps $1 to (field 1) (assuming a function called field exists, of course). As it turns out, it's not much work at all, not even for someone who has never hacked the Scheme reader (that would be me :-).
First of all, we need to look at the set-read-syntax! procedure, which is helpfully built into Chicken. It is used like this:
(set-read-syntax! <character>
  (lambda (port)
    ...read characters...
    ...return custom value...))
...where <character> is the first character of the new literal. The lambda that follows can then do custom reading from port, resulting in data that can be manipulated at will. In this case, I want the literal to start with $, then read digits, and stop as soon as a non-digit is encountered. So my code should look something like this:
(set-read-syntax! #\$
  (lambda (port)
    (let* ((s (read-number port))
           (i (string->number s)))
      (field i))))
Except that Chicken doesn't have a read-number function. Fortunately, it's not hard to write one. Here's a version using read-token. (read-token reads a character at a time and tests it against a predicate, collecting characters that match the predicate, stopping as soon as one doesn't match, and returning the collected characters as a string.)
(define (number-char? c)
  (member c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)))

(define (read-number port)
  (read-token number-char? port))
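To see read-number on its own, before the reader gets involved, it can be tried on a string port. This is only a quick sketch: the import assumes CHICKEN 5, where read-token lives in the (chicken io) module (older releases expose it differently), and the input string "42abc" is made up for the example.

```scheme
;; Assumption: CHICKEN 5 module layout; read-token comes from (chicken io).
(import (chicken io))

(define (number-char? c)
  (member c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)))

(define (read-number port)
  (read-token number-char? port))

;; A string port stands in for the port the reader would hand us.
(define port (open-input-string "42abc"))

;; Collects the leading digits and stops at the first non-digit.
(define digits (read-number port))

;; The non-digit is still waiting on the port; read-number did not consume it.
(define next (read-char port))

(write digits) (newline)
(write next) (newline)
```

The point of the second read is that the character that ends the number stays on the port, so whatever follows the literal (a space, a closing paren) is still there for the reader to handle.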
Now let's test it with a dummy implementation of field.
(define *fields* '())
(define (field n)
  (cond
    ((< n 1) "")
    ((> n (length *fields*)) "")
    (else (list-ref *fields* (- n 1)))))
(set! *fields* (string-split "the quick brown fox jumps over the lazy dog"))
(printf "~a ~a ~a~n" $1 $9 $5)
...prints "the dog jumps". :-)
(And yeah, my code isn't perfect, but it's just for demonstration purposes.)
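One of those imperfections is worth sketching. The handler above calls field while the expression is being read, so $1 is replaced by whatever the first field happens to be at read time. A variant that really maps $1 to the form (field 1) just returns a list instead. The following is a rough sketch, not a definitive version: it assumes CHICKEN 5 module names, uses char-numeric? as a stand-in for number-char?, and the error message is my own invention.

```scheme
;; Assumption: CHICKEN 5 module layout.
(import (chicken read-syntax) (chicken io))

;; char-numeric? stands in for the hand-written number-char? predicate.
(define (read-number port)
  (read-token char-numeric? port))

;; Return the list (field n) rather than calling field while reading.
(set-read-syntax! #\$
  (lambda (port)
    (let ((n (string->number (read-number port))))
      (if n
          (list 'field n)
          (error "expected digits after the dollar sign")))))

;; Reading from a string shows what the reader now produces.
(define form (read (open-input-string "$3")))

(write form) (newline)
```

With this version the reader hands back a form, and field is only called when that form is evaluated, so the literal keeps working if *fields* changes between reading and evaluating the code.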
Much like Ruby's monkeypatching, defining custom literals is probably not something that should be used in libraries a lot, but it looks like it could be very useful in DSLs or tools like the one mentioned.