Macro trial balloons

I've been hacking on and off on s1, my vaporware Awk-like tool that uses Scheme for scripting. The "language" defined by s1 includes a few macros. Strictly speaking, these are redundant, but they do make things shorter, which is important for writing one-liners.

Anyway, I am still kind of new to macros... so I'm wondering if the code I wrote is sound. I'm offering the definitions here for review. (Comments from experienced Schemers are welcome.)

1. inc! is used to easily increase variables. It can take any number of arguments; if none are given, 1 is added by default. For example:

  • (inc! x) adds 1 to x
  • (inc! x 33) adds 33 to x
  • (inc! x 1 2 3 4 5) adds (+ 1 2 3 4 5) to x

As the exclamation mark indicates, inc! changes the variable in-place. Here's its current definition:

(define-macro (inc! name . args)
  ;; Decide at expansion time whether any amounts were given, so each
  ;; argument expression is evaluated exactly once at run time.
  (if (null? args)
      `(set! ,name (+ ,name 1))
      `(set! ,name (+ ,name ,@args))))
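
For comparison, here is a rough sketch of the same idea written with syntax-rules, assuming the Scheme in use supports it (it is standard since R5RS). This is not s1's actual definition, just one way it could look. Each amount expression ends up in the expansion exactly once, and no gensym is needed:

```scheme
;; Sketch only: inc! as a hygienic macro.
(define-syntax inc!
  (syntax-rules ()
    ;; No amounts given: add 1.
    ((_ name) (set! name (+ name 1)))
    ;; One or more amounts: splice them into a single addition.
    ((_ name amount ...) (set! name (+ name amount ...)))))

(define x 0)
(inc! x)            ; x is now 1
(inc! x 33)         ; x is now 34
(inc! x 1 2 3 4 5)  ; x is now 49
```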

2. Defining multiple variables does not mix well with one-liners. So I added a macro def that allows one to define them quickly and concisely. Like inc!, def takes any number of arguments. If an argument is a list (name value), then a variable is created with the given name and value. If an argument is a symbol, then a variable is created with that name and a value of 0. (Awk is often used to add numbers, so zero seems the most sensible default, IMHO.)

Examples:

  • (def x) -- same as (define x 0)
  • (def x y) -- same as (define x 0) (define y 0)
  • (def (a 1) (b "hello")) -- same as (define a 1) (define b "hello")
  • (def q (w 3)) -- same as (define q 0) (define w 3)

Here's the current definition:

(define-macro (def . args)
  (if (null? args)
      #f
      (let* ((a (car args))
             (rest-args (cdr args))
             (name (if (list? a) (car a) a))
             (value (if (list? a) (cadr a) 0)))
        `(begin
           (define ,name ,value)
           (def ,@rest-args)))))

(I'm not sure about the #f; it's not supposed to return a value anyway.)
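
One way to sidestep the #f question is to never expand the empty case at all. The following is a sketch using syntax-rules (an assumption on my part that it is available; it is not what s1 currently does). The single-argument forms expand straight to a define, and only forms with two or more arguments recurse:

```scheme
;; Sketch only: def without an empty base case.
(define-syntax def
  (syntax-rules ()
    ;; (def (name value)) defines name with the given value.
    ((_ (name value)) (define name value))
    ;; (def name) defines name with the default value 0.
    ((_ name) (define name 0))
    ;; Two or more arguments: handle the first, then the rest.
    ((_ first rest ...) (begin (def first) (def rest ...)))))

(def q (w 3) (s "hello"))
;; q is 0, w is 3, s is "hello"
```

With this version, (def) with no arguments is simply a syntax error, which seems acceptable since it would not define anything anyway.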

In any case, s1's auxiliary functions and macros allow for concise code. (Some of it is sloppy, but useful for "scripting", especially one-liners. Naturally, it's always possible to write longer scripts using "cleaner" code.)

For example, here's a one-liner that takes the last words on the given lines and adds them up (assuming they are numbers):

s1 '(B (def s)) (inc! s &$nf) (A (out s))'

(I'm not sure about the & syntax yet; it's used here as a shortcut for the as-number function, which attempts to convert a string to a number, returning 0 by default.)
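
To make the "returning 0 by default" part concrete, here is a minimal stand-in for as-number. The real one in s1 may differ; this sketch only relies on the standard string->number, which returns #f when the string is not a valid number:

```scheme
;; Hypothetical stand-in for s1's as-number.
(define (as-number s)
  ;; string->number yields #f for non-numeric input, so fall back to 0.
  (or (string->number s) 0))

(as-number "42")    ; => 42
(as-number "3.5")   ; => 3.5
(as-number "abc")   ; => 0
```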

B and A are shorthands for BEFORE and AFTER, blocks that are executed before and after the main code (which is executed for each line in the given text). The actual order in which these appear doesn't matter, but it's probably more intuitive to do before-main-after.
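
The before/main/after model can be pictured as an ordinary procedure. This is a simplified sketch, not s1's implementation: run-blocks is a made-up name, the blocks are plain lambdas, and the input is a list of strings rather than a file:

```scheme
;; Hypothetical driver: run `before` once, `main` for each line, `after` once.
(define (run-blocks before main after lines)
  (before)
  (for-each main lines)
  (after))

(define total #f)
(define result
  (run-blocks
    (lambda () (set! total 0))                  ; the B block
    (lambda (line)                              ; the main code, once per line
      (set! total (+ total (or (string->number line) 0))))
    (lambda () total)                           ; the A block
    '("10" "20" "oops" "12")))
;; result is 42; the non-numeric line counts as 0
```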

Print the number of lines, words and characters (like wc):

s1 '(B (def c w)) (inc! w nf) (inc! c (len $0) 1) (A (out nl w c))'

Print names starting with "Ga":

s1 '(if (~ #/^Ga/) (print $0))' /usr/share/dict/propernames

(I'm using regex literals, and ~ is the same as string-search, except it matches against $0 (the whole line) by default.)

These are just teasers. Actual code is subject to change. I will release this when the "API" is somewhat stable. It's still mostly a toy, though... :-)

Making s1 code shorter

More efforts to make s1 code shorter...

I added/changed the following:

  • slice function (this will come in handy for string manipulation, although (slice s a b) is still longer than s[a:b])
  • nf variable (indicates the number of fields in a line)
  • $ now accepts any expression, not just a number literal (so we can say $nf or $(- x 1) or whatever)
  • -f command line option to quickly set the field separator
  • created an alias s1 for 'csi -ss /path/to/s1.scm' (OK, this is not a change to the tool proper, but it helps... nobody wants to type csi -ss with the full path, all the time)
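
To give an idea of what a Python-style slice could look like, here is a sketch for strings only. The behavior shown (negative indices counting from the end, out-of-range indices being clamped) is my assumption modeled on Python's s[a:b]; the slice in s1 may work differently:

```scheme
;; Hypothetical slice: (slice s a b) behaves roughly like Python's s[a:b].
(define (slice s a b)
  (let* ((n (string-length s))
         ;; Map a possibly negative index into the range 0..n.
         (norm (lambda (i)
                 (let ((j (if (< i 0) (+ n i) i)))
                   (max 0 (min n j)))))
         (start (norm a))
         (end (norm b)))
    (if (< start end)
        (substring s start end)
        "")))

(slice "hello" 1 3)    ; => "el"
(slice "dir:" 0 -1)    ; => "dir"
(slice "hello" -3 5)   ; => "llo"
```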

None of this is particularly original (as it's all borrowed from Awk and Python), but it does help to make code much shorter. I can now write a one-liner like this:

find "${1-.}" -type d | s1 -f/ '(print "   |" (make-string nf #\-) $nf)'

...which produces almost the same tree as described here. Almost, because it uses one "-" for each directory rather than two. We can fix that by making the example somewhat longer (and less readable to non-Schemers, although admittedly the Awk example isn't very readable to non-Awkers either):

find "${1-.}" -type d | s1 -f/ '(print "   |" (make-string (* (- nf 1) 2) #\-) $nf)'

Next up: regular expressions...

I can has scripting?

Brainstorming a bit here...

So I saw this cool one-liner script. It displays a directory tree:

ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'

I have been hacking (on and off) on an Awk-like tool that uses Chicken for scripting. Its working title is s1, and so far it supports begin/main/end blocks like Awk, and $N-notation for quick access to fields.

The real problem is not adding Awk-like functionality as Scheme functions, but rather making the Scheme concise enough that it can be used for one-liners. For example, the following works, but is still too verbose:

ls -l | csi -ss s1.scm '(before (set! total 0)) (set! total (+ total (as-number $5))) (after (print total))'

(It shows the total size of files in the current directory, in case it wasn't obvious. :-)

It gets even worse when we try to emulate the ls/grep/sed script mentioned above. Writing this as a one-liner just isn't feasible at this point; rather, I have to use a script:

;; tree.s1
;; Call with: ls -aR | csi -ss s1.scm -s tree.s1

(define (branch parts)
  (let* ((parts (reverse parts))
         (dirname (string-trim-right (first parts) #\:))
         (line (apply conc (map (lambda (s) "--") (cdr parts)))))
    (conc "|" line dirname)))

(when (string-suffix? ":" $0)
  (let* ((parts (string-split $0 "/")))
    (print "   " (branch parts))))

Much of the work goes into processing strings and lists, but even without that the code probably still would not fit on one line.

Granted, it's hard to beat the conciseness of regular expressions (as used by grep and sed), but if I want s1 to be useful, it needs to support (much) shorter code. (Maybe I should rewrite it in Arc? ;-)

Here are some thoughts about making this shorter:

1. I dearly miss slicing. In Python, I would have used s[:-1] to chop off the trailing colon. Adding a slice function to the toolbox would probably be useful.

2. I could leverage some of the Awk-like features. This Reddit comment does just that:

find "${1-.}" -type d | awk -F/ '{printf "  |%*s%s\n",(NF-1)*2,"",$NF}'

It takes advantage of the following features:

  • using "/" as a field separator (effectively splitting the path with no extra code)
  • using NF (indicating the number of fields in a line) to compute the indentation level
  • using $NF to display the last field (i.e. the directory name we want to see)

As it happens, s1 can do all these things as well, but it won't be quite as concise. This is something to work on. Maybe there could be a special variable (or symbol) that translates to (length *fields*). I would have to hack the current $-literal to accept any expression, not just a number literal.
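
Roughly, the idea looks like this in plain Scheme. This is only a sketch: split-fields and field are hypothetical helper names, and the real s1 would set *fields* for each input line rather than define it once:

```scheme
;; Split a line on a separator character, keeping empty fields.
(define (split-fields line sep)
  (let loop ((chars (string->list line)) (current '()) (fields '()))
    (cond ((null? chars)
           (reverse (cons (list->string (reverse current)) fields)))
          ((char=? (car chars) sep)
           (loop (cdr chars) '()
                 (cons (list->string (reverse current)) fields)))
          (else
           (loop (cdr chars) (cons (car chars) current) fields)))))

(define *fields* (split-fields "./src/lib" #\/))  ; ("." "src" "lib")
(define nf (length *fields*))                     ; 3

;; Hypothetical accessor, 1-based like Awk's $N.
(define (field n) (list-ref *fields* (- n 1)))

(field nf)   ; => "lib"
```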

The tricky part is to make things concise while still keeping the spirit of Scheme. :-/ s1 will never automagically coerce numbers to strings (or vice versa), for example. I can, however, make it easy to convert them explicitly. (This is also the Pythoneer in me speaking... explicit is better than implicit, and all that good stuff.)

Anyway, there was really no point to this post, except letting the world know that I am tinkering with a new toy. :-) If it ever actually becomes useful, I'll release it... Until then, I might talk more about it here.
