Archive for January, 2008

Arc-macros, baby. Arc-macros. (part II)

More useless reader macros ahead. This time, let's implement Arc's [ ... _ ... ] lambda shortcut. In Chicken Scheme, of course. ^_^

I will be using { } rather than [ ]. Incidentally, Chicken already supports { } as an alternative for ( ):

> '{1 2 3}
(1 2 3)
> {+ 1 2}
3

This reader macro will override that. Anyway, the code follows. It's clumsy (although not as clumsy as my first version), and there's probably a better/shorter way to do it that I'm not aware of yet. In any case, it's naive; for example, this version does not allow nesting the { } constructs. Also, the replace-underscore function won't replace underscores in nested lists. 1) And so on. Remember that this is just for demonstration purposes before pointing out the zillions of flaws. :-)

(define (read-until-next-curly port)
  "clumsy way to read up until the first }."
  (read-token (lambda (c) (not (char=? c #\}))) port))

(define (replace-underscore exprlist)
  (define (replace-underscore-item y)
    (if (equal? y '_) 'x y))
  (map replace-underscore-item exprlist))

(define (make-lambda-form exprlist)
  `(lambda (x) (,@exprlist)))

(define (eval-string s)
  (with-input-from-string s read))

(define (transform-into-lambda s)
  (let* ((expr (eval-string s))
         (expr2 (replace-underscore expr)))
    (make-lambda-form expr2)))

(set-read-syntax! #\{
  (lambda (port)
    (let* ((s (read-until-next-curly port))
           (t (string-append "(" s ")")))
      (read-byte port) ; read trailing '}'
      (transform-into-lambda t))))
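
The nested-list limitation is easy to see in a small model. Here is a rough sketch in Python rather than Chicken, with nested Python lists standing in for s-expressions and the string "_" standing in for the symbol. The names replace_underscore and make_lambda_form only mirror the Scheme functions above; this is an illustration under those assumptions, not a drop-in fix for the reader macro.

```python
def replace_underscore(expr, name="x"):
    """Replace every "_" in a (possibly nested) list with name."""
    if isinstance(expr, list):
        # Recurse into sublists so that nested underscores are found too.
        return [replace_underscore(item, name) for item in expr]
    return name if expr == "_" else expr


def make_lambda_form(expr):
    """Wrap an expression in a one-argument lambda form (as nested lists)."""
    return ["lambda", ["x"], replace_underscore(expr)]


form = make_lambda_form(["+", ["*", "_", 2], 1])
print(form)
```

The only real difference from the flat version is the recursive call on sublists; the same shape should carry over to Scheme with a list? test and a recursive map.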

Testing...

(print { + _ 1 })     ; => #<procedure (? x)>
(print ({+ _ 1} 44))  ; => 45

(print (map { * _ 2 } '(1 2 3 4 5 6)))
; => (2 4 6 8 10 12)

Cool. Of course, if you use a lot of small lambdas, then it's possible to write a macro that shortens things for you, rather than having to resort to syntax changes. For example:

> (define-macro (fn . body)
>   `(lambda (_) ,body))
> (fn + _ 1)
#<procedure (? _)>
> ((fn + _ 1) 44)
45

;; compare:
> ((lambda (x) (+ x 1)) 44)
45

Anyway, much like the previous post, this was just an excuse to explore reader macros a bit more. So don't shoot me...

1) On a side note, why doesn't SRFI-1 have a function to replace items in a list?

:: Comments (3)

Arc-macros, baby. Arc-macros.

OK, this is not really about macros, it's really a reader hack, but bear with me... :-)

Let's implement Arc's "negation" operator (~) in Chicken. It's only a few lines of code.

(define (negate pred)
  (lambda (x)
    (not (pred x))))

(set-read-syntax! #\~
  (lambda (port)
    (let ((expr (read port)))
      (list 'negate expr))))

That's all. Test test... (Code below uses SRFI-1 for filter.)

> (use srfi-1)
; loading library srfi-1 ...
> (filter ~odd? '(1 2 3 4 5 6 7 8))
(2 4 6 8)
> (filter ~(lambda (x) (> x 2)) '(0 1 2 3 4 5 6))
(0 1 2)

This probably falls in the category "don't try this at home". But if you're coming from languages where this just isn't possible, this kind of stuff is really cool.

Note #1: you can just use (negate odd?) and such, which is slightly longer but probably cleaner. And that kind of thing would work in Python as well. (Although function composition isn't all that popular in Python-land.)

Note #2: This implementation just reads the expression that follows ~, then feeds it to negate. So using a lambda works as well (but kind of defeats the purpose, which is conciseness).

By the way, the reader extension I showed a few days ago (to add Awk-like $N behavior) works, but I found out that it's not really how you're supposed to write it. At the point something like $3 is read, we're *reading* rather than *evaluating*, so it should really expand to something that returns the right result *when evaluated later*. So better code would be:

(set-read-syntax! #\$
  (lambda (port)
    (let* ((s (read-number port))
           (i (string->number s)))
      (list 'field i))))

In other words:

  • when the reader encounters $3, it expands it to the form (field 3) (but *does not* evaluate it at that point)
  • later on, we evaluate (field 3) and get the correct result, using whatever is in *fields* at the time of evaluation

So yeah, maybe this is like macros after all, but I don't know the correct term. :-)
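
To make the read-versus-evaluate distinction concrete, here is a toy model in Python. It is only a sketch: read_dollar and evaluate are made-up names, a tuple stands in for the form (field 3), and the fields list stands in for *fields*.

```python
fields = []  # stands in for *fields*


def field(n):
    """Return the n-th field (1-based), or "" when n is out of range."""
    if n < 1 or n > len(fields):
        return ""
    return fields[n - 1]


def read_dollar(token):
    """'Reader' step: turn "$3" into the form ("field", 3), unevaluated."""
    return ("field", int(token[1:]))


def evaluate(form):
    """'Eval' step: look the field up in whatever fields holds right now."""
    op, n = form
    if op != "field":
        raise ValueError("unknown form: %r" % (form,))
    return field(n)


form = read_dollar("$3")                   # read while fields is still empty
fields[:] = "the quick brown fox".split()  # the data arrives later
print(form, "->", evaluate(form))
```

Because the reader only builds the form, the lookup sees the fields that exist at evaluation time, not the (empty) list that existed when "$3" was read.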

There's another post like this coming up, using Arc as an excuse to tinker with the Scheme reader.

Update (2008-02-04): Here is similar code in Python, sort of. The additional "syntax" (really just a class that defines "~", which you then wrap functions in) seems to be more trouble than it's worth. If you do this kind of thing a lot in Python (probably not, but you never know), you might be better off using a simple negate function, e.g.

def negate(f):
    return lambda *args, **kwargs: not f(*args, **kwargs)

(I normally would not have used lambda, but after more than a month of Scheming, it's hard not to. :-)
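
For the curious, the class-based approach could look roughly like this. It is a minimal sketch and not necessarily the linked code; Pred is a made-up name, and the trick relies on Python's __invert__ hook, which is what the unary ~ operator calls.

```python
class Pred:
    """Wrap a predicate so that ~pred yields its negation."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # Calling the wrapper just calls the wrapped predicate.
        return self.func(*args, **kwargs)

    def __invert__(self):
        # ~pred builds a new wrapper around the negated predicate.
        return Pred(lambda *args, **kwargs: not self.func(*args, **kwargs))


odd = Pred(lambda n: n % 2 == 1)
evens = list(filter(~odd, [1, 2, 3, 4, 5, 6, 7, 8]))
print(evens)
```

Every predicate has to be wrapped in Pred before ~ works on it, which is exactly the extra trouble compared to a plain negate function.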

:: Comments (3)

Arc, first impressions

I took a quick look at Arc. If nothing else, I want to see how it compares to Scheme. Here are a few remarks (in no particular order).

  • = is used for assignment rather than for equality testing. This may be confusing to people using other variants of Lisp. Possibly less so to users of languages that use the same operator. Still, inside parentheses it looks a bit odd: (= x 4)
  • Many abbreviations! Quite a few of them seem gratuitous; I'm used to def, but what about mac, rem, pr, prn, fn, o, and so on, some of which only shave off a character or two. It makes the resulting code a bit shorter, but whether it's clearer is a different issue.
  • The [ ... _ ... ] syntax seems useful. This is basically a shortcut for simple lambdas, e.g. [+ _ 1] is equivalent to (fn (x) (+ x 1)).
  • Odd assignment rules for lists and hash tables. E.g. if airports is a hash table, then (airports "Boston") looks up the key "Boston" in there. This also works in assignments: (= (airports "Orlando") 'mco). Lists work the same way, if I understand correctly. Personally, I think it's a bit confusing, and I'm not sure it's a good idea to overload this kind of syntax. (foo bar) can now mean: function application, hash table lookup, list lookup...
  • On the other hand, you get to do table lookups with map:
    (map airports '("San Francisco" "Orlando" "Paris"))
    => (sfo mco cdg)
  • On a side note, the use of hash tables for dictionaries, and the keys and vals functions to get keys and values, look suspiciously Pythonic. :-)
  • Compose functions with ":":
    odd:car is equivalent to (fn (x) (odd (car x)))
  • "Negation" operator ~:
    ~odd? is equivalent to (fn (x) (not (odd? x)))
    N.B. It's not hard to do the same thing in Chicken... I already talked about defining custom literals in a separate post, and an implementation of ~ would use the same mechanism. More about this later.
  • if with more than 3 arguments is a nested if... e.g.
    (if a b c d e) => (if a b (if c d e))
  • Some forms were reinvented without parentheses, which is much clearer, IMHO. E.g.
    (let x 1 ...body...)
    (with (a 1 b 2) ...body...)
    Compare this to the parentheses-fest that is let in Scheme and Common Lisp. :-}
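
Since I keep comparing things to Python: two of the features above are easy to approximate there. This is only a sketch; compose, odd and car are helper names I'm defining here, not Python built-ins, and the airport codes are the ones from the example above.

```python
def compose(*funcs):
    """compose(f, g)(x) == f(g(x)), roughly what Arc's f:g stands for."""
    def composed(x):
        for f in reversed(funcs):
            x = f(x)
        return x
    return composed


def odd(n):
    return n % 2 == 1


def car(seq):
    return seq[0]


odd_car = compose(odd, car)
print(odd_car([3, 4, 5]))

# Table lookup via map: a dict is not callable in Python, so pass its
# .get method instead of the table itself.
airports = {"San Francisco": "sfo", "Orlando": "mco", "Paris": "cdg"}
codes = list(map(airports.get, ["San Francisco", "Orlando", "Paris"]))
print(codes)
```

The difference is that Python keeps "call a function" and "look up a key" apart, so the overloading that bothers me in Arc doesn't arise, at the cost of the extra .get.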

No "conclusion" at this point; I have only skimmed the tutorial and tinkered a bit with the REPL. Plus, it's likely to change anyway.

On forums, the biggest gripe people seem to have is the lack of Unicode support, and PG's apparent unwillingness to add it.

:: Comments (9)

Arc!

Who cares about the Florida primaries when there is some *real* news? Arc is out. (Took them long enough. :-) The official site is here. I will probably download it and give it a spin tonight, after wrestling through the tutorial, forum discussions, and comment threads on Reddit and Y Combinator.

More about this later, without a doubt.

(First thought: It's built on top of MzScheme. Can we do s/MzScheme/Chicken? :-)

:: Comments (1)

WordPress: first impressions

This is the first time I have used a server-side blogging tool for my weblog. For Tao of the Machine I first used Kaa, then Firedrop. Efectos Especiales used Firedrop as well, and Interstellar Overdrive used a weird command-line-based tool called IV, which I never bothered to make available. (All of these generate static HTML, which is then uploaded via FTP.) And so now I'm using WordPress.

Hey, Dreamhost makes it extremely easy to install, so I'm using it. :-) Also, I don't use my horrible HughesNet connection anymore, so online editing is suddenly feasible.

So far I like it. Mostly. It has its problems, but overall, the experience has been relatively painless.

WordPress has a large number of plugins, most of which are easy to install and use (like wp-table). It's also more configurable than I expected it to be, and it's easy to make (small) changes to its PHP code even if you don't actually know much PHP.

Editing in a browser is not ideal, but IMHO it's still preferable to using a tool like e.g. MarsEdit, since the web-based editor has features that external editors don't have. And I definitely don't want to write straight HTML.

That said, the WordPress editor is not optimal for inserting or editing code (using <pre>), and a few times so far it has managed to mess up my formatting (esp. when switching between the "Visual" and "Code" tabs). It also has the annoying habit of replacing "neutral" quotes (both single and double) with left and right versions, which isn't much of a problem in regular text, but it is when I want to display e.g. Python code (where an apostrophe should not look like a backtick).

(Fortunately, there's a way to change that behavior... in wp-includes/default-filters.php, comment out the lines that say

add_filter('bloginfo', 'wptexturize');

and the quotes show up normally. (via.))

All in all, I like it, probably more so than I initially thought. It beats having to write your own blogging tool. :-) I sometimes miss the flexibility of my home-grown systems, but then again I didn't use my macros *that* much.

:: Comments (1)

Defining custom literals in Chicken Scheme

In a previous post, I briefly pondered what a Scheme-based Awk-like tool would look like. Awk has a concise syntax to access fields; e.g. $1 is the first field in a line, etc. A similar tool written in Scheme would benefit from having such syntax as well.

So, I wondered how much work it would be to add it to Chicken. Let's say, something that maps $1 to (field 1) (assuming a function called field exists, of course). As it turns out, it's not much work at all, not even for someone who has never hacked the Scheme reader (that would be me :-).

First of all, we need to look at the set-read-syntax! procedure, which is helpfully built into Chicken. It is used like this:

(set-read-syntax! <character>
  (lambda (port)
    ...read characters...
    ...return custom value...))

...where <character> is the first character of the new literal. The lambda that follows can then do custom reading from port, resulting in data that can be manipulated at will. In this case, I want the literal to start with $, then read digits, and stop as soon as a non-digit is encountered. So my code should look something like this:

(set-read-syntax! #\$
  (lambda (port)
    (let* ((s (read-number port))
           (i (string->number s)))
      (field i))))

Except that Chicken doesn't have a read-number function. Fortunately, it's not hard to write one. Here's a version using read-token. (read-token reads a character at a time and tests it against a predicate, collecting characters that match the predicate, stopping as soon as one doesn't match, and returning the collected characters as a string.)

(define (number-char? c)
  (member c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)))

(define (read-number port)
  (read-token number-char? port))
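
If the read-token description sounds abstract, here is the same idea modeled in Python. This is a sketch of the behavior as described above, not Chicken's actual implementation; read_token and number_char are my own names, and an io.StringIO object plays the role of the port.

```python
import io

DIGITS = frozenset("0123456789")


def number_char(c):
    """True for the ASCII digits 0-9 only, like number-char? above."""
    return c in DIGITS


def read_token(pred, port):
    """Collect characters while pred holds; leave the first non-match unread."""
    chars = []
    while True:
        pos = port.tell()
        c = port.read(1)
        if c == "" or not pred(c):
            port.seek(pos)  # "unread" the character that ended the token
            return "".join(chars)
        chars.append(c)


port = io.StringIO("42abc")
token = read_token(number_char, port)
rest = port.read()
print(repr(token), repr(rest))
```

The important detail is that the character that ends the token stays in the stream, so whatever reads next still gets to see it.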

Now let's test it with a dummy implementation of field.

(define *fields* '())

(define (field n)
  (cond
    ((< n 1) "")
    ((> n (length *fields*)) "")
    (else (list-ref *fields* (- n 1)))))

(set! *fields* (string-split "the quick brown fox jumps over the lazy dog"))

(printf "~a ~a ~a~n" $1 $9 $5)

...prints "the dog jumps". :-)

(And yeah, my code isn't perfect, but it's just for demonstration purposes.)

Much like Ruby's monkeypatching, defining custom literals is probably not something that should be used in libraries a lot, but it looks like it could be very useful in DSLs or tools like the one mentioned.

:: Comments (1)

Python vs Scheme: strings

Python and Scheme have different philosophies when it comes to strings. Scheme strings are mutable and consist of characters, which are a separate type. By contrast, Python's strings are immutable, and its "characters" are really strings with a length of one.

Also, Python uses both " " and ' ' for string literals, while Scheme only uses " ".

Aside from that, strings can be used in these languages in ways that are very similar (as opposed to e.g. C's strings which tend to involve memory allocation and pointer arithmetic). So in this post, I will be focusing on common string operations, and what they look like in both Python and Scheme.

In Python, all these string operations work out of the box. In Scheme, some are provided by R5RS, while others are found in SRFI-13 (a very useful library which has a large number of non-trivial string operations), and yet others are included by Chicken (but not necessarily part of other Scheme implementations). In the examples below, I'm assuming Chicken with SRFI-13 imported.

Joining multiple strings »

Python has the very obvious + operator to concatenate two strings, something which won't work in Scheme; (+ "a" "b") is an error. It also has the butt-ugly str.join method to join a list of strings. Scheme has string-append (R5RS) and string-join (SRFI-13).

# Python
>>> "hello" + " " + "world"
'hello world'
>>> " ".join(['my', 'name', 'is', 'poison'])
'my name is poison'

;; Scheme
> (string-append "hello" " " "world")
"hello world"
> (string-join '("my" "name" "is" "poison") " ")
"my name is poison"

Getting the length »

These functions are very simple, but I'm mentioning them anyway because there might be surprises here for people coming from other languages. Python uses the len() function rather than a method (like e.g. Ruby and Io do). Scheme uses string-length rather than length (which only works on lists).

# Python
>>> len("koyaanisqatsi")
13

;; Scheme
> (string-length "koyaanisqatsi")
13

Substrings »

Python uses the [] syntax for indexing and slicing; it also accepts negative numbers (to count from the end of the string). Scheme has string-ref and substring, which work similarly, except they don't take negative values. (Note that string-ref returns a *character* rather than a one-length string.)

# Python
>>> s = "hello"
>>> s[0]
'h'
>>> s[2]
'l'
>>> s[1:3]
'el'
>>> s[-3:]
'llo'

;; Scheme
> (define s "hello")
> (string-ref s 0)
#\h
> (string-ref s 2)
#\l
> (substring s 1 3)
"el"

Comparing strings »

In Python, strings are compared with the usual == family of operators. Case matters; "a" does not compare equal to "A". Scheme, on the other hand, has a number of functions to do the comparison; string=? and friends for case-sensitive comparing like in Python, and the string-ci=? family for case-insensitive comparing.

# Python
>>> "abc" == "abc"
True
>>> "abc" == "ABC"
False
>>> "b" > "a"
True

;; Scheme
> (string=? "abc" "ABC")
#f
> (string-ci=? "abc" "ABC")
#t
> (string>? "b" "a")
#t

SRFI-13 also provides equivalents for Python's useful startswith() and endswith() methods:

> (string-prefix? "He" "Herbert")
#t
> (string-suffix? "tt" "Abbott")
#t
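
For completeness, the Python side of these last two Scheme snippets, as a small sketch in Python 3 syntax. Python has no separate case-insensitive comparison functions, so the usual approach is to normalize both strings first; casefold() only exists in Python 3.3 and later.

```python
# Case-insensitive comparison: normalize both sides before comparing.
same_ci = "abc".lower() == "ABC".lower()

# casefold() (Python 3.3+) is more thorough for non-ASCII text.
same_folded = "Straße".casefold() == "STRASSE".casefold()

# Note the argument order: the string comes first, the prefix/suffix second,
# the reverse of SRFI-13's string-prefix? and string-suffix?.
has_prefix = "Herbert".startswith("He")
has_suffix = "Abbott".endswith("tt")

print(same_ci, same_folded, has_prefix, has_suffix)
```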

Changing case »

Speaks for itself. Note that R5RS defines char-upcase and char-downcase, but not string-upcase or string-downcase (those are in SRFI-13).

# Python
>>> "kibbles and bits".upper()
'KIBBLES AND BITS'
>>> "KIBBLES AND BITS".lower()
'kibbles and bits'
>>> "kibbles and bits".capitalize()
'Kibbles and bits'
>>> "kibbles and bits".title()
'Kibbles And Bits'

;; Scheme
> (string-upcase "kibbles and bits")
"KIBBLES AND BITS"
> (string-downcase "KIBBLES AND BITS")
"kibbles and bits"
> (string-titlecase "kibbles and bits")
"Kibbles And Bits"

Splitting »

Splitting a string into a list of smaller strings is a common thing to do in high-level languages. Luckily, for common cases, we don't have to resort to regular expressions. Python uses the split() method, Scheme has string-tokenize (SRFI-13) and string-split (Chicken).

# Python
>>> "a few good men".split()
['a', 'few', 'good', 'men']
>>> "abracadabra".split("b")
['a', 'racada', 'ra']

;; Scheme
> (string-tokenize "a few good men")  ;; SRFI-13
("a" "few" "good" "men")
> (string-split "a few good men")     ;; Chicken built-in
("a" "few" "good" "men")
> (string-split "abracadabra" "b")
("a" "racada" "ra")

Trimming »

To trim characters from the left and/or right side of a string, Python uses the lstrip (from the left), rstrip (from the right) or strip (both sides) methods. Somewhat asymmetrically, in Scheme (or, more precisely, SRFI-13) these functions are called string-trim (from the left), string-trim-right (from the right) and string-trim-both (both sides).

Both Python and Scheme allow you to specify the character(s) that need to be stripped. By default, whitespace is removed, as this seems to be the most common use case.

# Python
>>> s = "  i like cookies  "
>>> s.strip()
'i like cookies'
>>> s.lstrip()
'i like cookies  '
>>> s.rstrip()
'  i like cookies'
>>> "xxxhi!xxx".strip("x")
'hi!'

;; Scheme
> (define s "  i like cookies  ")
> (string-trim s)
"i like cookies  "
> (string-trim-right s)
"  i like cookies"
> (string-trim-both s)
"i like cookies"
> (string-trim-both "xxxhi!xxx" #\x)
"hi!"

Looping »

In Python, you can loop over a string (using for) or turn it into a list, but what you get is essentially a list of strings with length one. In Scheme, you get characters. Use string->list to get a list of characters, and string-map to map one string to another (much like the regular map, but it takes and returns a string).

# Python
>>> for c in "hello": print c,
...
h e l l o
>>> list("hello")
['h', 'e', 'l', 'l', 'o']

;; Scheme
> (string->list "hello")
(#\h #\e #\l #\l #\o)
> (for-each
>   (lambda (c) (printf "~a! " c))
>   (string->list "hello"))
h! e! l! l! o!
> (string-map char-upcase "hello")
"HELLO"

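Python has no direct counterpart to string-map, but one is short to write. A minimal sketch in Python 3 syntax (string_map is my own helper name, not a standard function):

```python
def string_map(func, s):
    """Apply func to each one-character string in s and join the results."""
    return "".join(func(c) for c in s)


shouted = string_map(str.upper, "hello")
print(shouted)
```

Since Python's "characters" are just strings, func is free to return more than one character (or none), which a char-to-char function like char-upcase cannot do.
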
Searching »

Python has several ways to search strings for contents... like the find/rfind methods (and their index/rindex counterparts) to find the index of a matching substring, and the in operator if you just want to know if a string has a certain substring, but don't need to know where exactly it starts.

You can do the same things in Scheme, assuming you use SRFI-13, as R5RS does not define any of this. string-index searches for a character (or a character set or a predicate), string-contains searches for a substring. When found, it returns the index, otherwise #f (which is useful because it allows one to write (if (string-contains s1 s2) ...)).

# Python
>>> "lemon-flavored jellibeans".find("e")
1
>>> "lemon-flavored jellibeans".rfind("e")
21
>>> "lemon-flavored jellibeans".find("el")
16
>>> "lemon-flavored jellibeans".find("xyz")
-1
>>> "el" in "lemon-flavored jellibeans"
True

;; Scheme
> (string-index "lemon-flavored jellibeans" #\e)
1
> (string-index-right "lemon-flavored jellibeans" #\e)
21
> (string-contains "lemon-flavored jellibeans" "el")
16
> (string-contains "lemon-flavored jellibeans" "xyz")
#f
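
The #f convention matters more than it may seem. Python's find() returns -1 for a miss, and -1 is truthy while 0 is falsy, so the naive test goes wrong in both directions. A short sketch (Python 3 syntax):

```python
s = "lemon-flavored jellibeans"

# Using find() directly as a condition is a trap:
miss_is_truthy = bool(s.find("xyz"))    # a miss returns -1, which is truthy
hit_is_falsy = bool(s.find("lemon"))    # a hit at index 0 is falsy

# Safer: use "in" for a yes/no test, or compare against -1 explicitly.
found = "el" in s
found_explicit = s.find("el") != -1

print(miss_is_truthy, hit_is_falsy, found, found_explicit)
```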

Replacing »

Python's replace() method is very easy: simply specify the substring that needs to be replaced, and its replacement. By contrast, SRFI-13's string-replace is more sophisticated. It takes a string, a replacement string, and start/end indices that indicate what part of the string needs to be replaced. See the example below.

# Python
>>> "I like cookies".replace("cookie", "hot dog")
'I like hot dogs'

;; Scheme
> (string-replace "i like cookies" "hot dog" 7 13)
"i like hot dogs"

I don't know if there's a version that is easier to use floating around somewhere (in a SRFI or otherwise), but it's not so hard to write something that emulates the Python behavior, at least for the first occurrence:

(define (string-replace-v2 s before after)
  (let ((idx (string-contains s before)))
    (if idx
        (string-replace s after idx (+ idx (string-length before)))
        s)))
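
One caveat: string-replace-v2 stops after the first match, because string-contains only reports the first index, whereas Python's replace() substitutes every occurrence unless told otherwise. The optional third argument (a count) gives the first-match-only behavior, as this small sketch shows (Python 3 syntax):

```python
s = "cookie after cookie"

everything = s.replace("cookie", "hot dog")     # all occurrences
first_only = s.replace("cookie", "hot dog", 1)  # count=1: first match only

print(everything)
print(first_only)
```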

Also, Chicken has string-translate*, which works for our purposes, but is used with a table of elements to be replaced:

> (string-translate* "i like cookies"
>   '(("cookie" . "hot dog")))
"i like hot dogs"

:::

This has become a long post, longer than I intended, and there are still many things I haven't even touched upon yet... like Unicode, or the fact that some of the aforementioned functions have equivalents that change the string in-place, rather than returning a new string. Anyway, this wasn't meant to be a complete reference; it's more of a starting point, or a quick way to look up "I can do X in Python, how do I do it in Scheme?"


:: Comments (5)

Interlude: Scheme command-line tool idea

While I'm preparing a long Python-vs-Scheme post, here's a bit of filler. :-)

Check this. No, not the part about Common Lisp; what I'm interested in right now, is the snippet of Awk code, and the question whether a similar tool can be written in Scheme.

To put it more precisely, would it be possible to have a command-line tool, using Scheme as its language (and probably written in Scheme as well), that lets users write one-line queries like that?

It would be a cool thing to have, although I'm not sure if nested parentheses mix so well with the command line. Writing Scheme code in an editor is different from writing it at a shell prompt.

The Awk example could look something like this:

<name> '(begin (set-fs! ":")) (if (equal? $6 "/sbin/nologin") (print $1))' /etc/passwd

(It would have to use some reader manipulation to allow for expressions like $6 meaning (field 6), or something like that. Chicken already uses the $, so maybe a different syntax would be preferable.)

In any case, I am pondering this. The idea is to have a tool that is powerful, relatively easy to use, and still Scheme-y. Of course, the world probably doesn't need a new Awk, but I am exploring this idea as a coding/design exercise.

:: Comments (2)

Chandler

Chandler is pining for the fjords. Carlos Perez blames Python; Ned Batchelder comes to Python's defense.

First of all: I have always thought that Chandler was dubious marketing for Python, to be honest. When it was first announced, much was made of the fact that it was going to use Python as the development language... and then it just sat there for years, showing very little progress.

Upon reading Dreaming In Code, I got the impression that programming languages (whether Python or Java) are not to blame, but rather the fact that the project's goals were very vague and unclear, right from the start (which caused a slew of other, related, problems). Writing a revolutionary PIM is a great idea... but nobody knew what it should look like exactly, much less what it should do.

Python is great, but it doesn't design the program for you. :-} As such, I don't think it has anything to do with Chandler's failure (and neither does the static-vs-dynamic typing issue). If you don't know what you want, you can have the most productive language known to man, but it won't do you any good.

(And, of course, the fact that most of the developers were unfamiliar with Python, did not exactly push the project forward either...)

:: Comments

Useful WordPress plugin: wp-table

I've been wanting to display tables in my blog for a while, but the WordPress WYSIWYG editor isn't really suitable for that. Even if you write straight HTML, it will try to modify your table tags. :-(

Fortunately there's a cool plugin that makes it easy to create tables: wp-table. It adds a section to the WordPress admin where you can easily add and manage tables, then include them in your posts with [TABLE=ID].

Installing a WordPress plugin is easy too... Just stick the appropriate directory into wp-content/plugins, then activate the plugin in the admin. Sw33t.

Here's what such a table looks like: Quick Guide to SRFIs (very much a work in progress, by the way, and currently mostly for personal use -- I add SRFIs as I go).

:: Comments
