I have resumed my “Python vs Scheme” series. The latest installment can be found on my new blog. (I’m announcing it here because otherwise nobody would notice. <0.5 wink>)
As always, comments and corrections are welcome. ^_^
Python has a useful idiom that allows one to use the same file as both a module and a program. Consider this simple example:
    # foo.py
    def bar(x):
        print "bar says:", x

    if __name__ == "__main__":
        bar(42)
The if __name__ == "__main__" clause is only executed if foo.py is run as the “main program”. In other words, we can do both of the following:
    $ python foo.py
    bar says: 42

    $ python
    Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
    [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import foo
    >>> foo.bar(33)
    bar says: 33
Although admittedly a bit of a hack, this construct is well-known and often used.
Now, on to Chicken Scheme. Can we do the same? It required a bit of poking around in the documentation and mailing lists, but it appears the answer is yes.
    ;; foo.scm
    (define (foo x)
      (print "foo says: " x))

    (define (main args)
      (print "args: " args)
      (foo 42))
Now, we can run this as a script with csi -ss (which looks for a function called main and automagically calls it):
    $ csi -ss foo.scm
    args: ()
    foo says: 42
Notice that the main function has an argument args, which contains the command line arguments passed to the program:
    $ csi -ss foo.scm 1 2 3
    args: (1 2 3)
    foo says: 42

We can also import foo.scm from within an interactive session, in which case main is not called:
    $ csi
    CHICKEN
    Version 3.0.0 - macosx-unix-gnu-x86 [ manyargs dload ptables applyhook ]
    (c)2000-2008 Felix L. Winkelmann compiled 2008-03-05 on niflheim.local (Darwin)
    ; loading /Users/zephyrfalcon/.csirc ...
    ; loading /usr/local/lib/chicken/3/readline.so ...
    #;1> (use foo)
    ; loading ./foo.scm ...
    #;2> (foo 101)
    foo says: 101
But, but! Isn’t Chicken primarily a compiler? Does the above work too when using csc rather than csi? Actually it does, but the invocation is different. I use the following:
    $ csc foo.scm -postlude "(main (cdr (argv)))"
    $ ./foo
    args: ()
    foo says: 42
    $ ./foo 1 2 3
    args: (1 2 3)
    foo says: 42
The -postlude option can be used to specify code that runs when the executable is called. (Actually, the official explanation is: “Add EXPRESSIONS after all other toplevel expressions in the compiled file. This option may be given multiple times. Processing of this option takes place after processing of -epilogue.”)
I use (main (cdr (argv))) as the postlude expression, which seems to pass command line arguments the same way as csi -ss passes them to the main function (although there might be a catch that I’m not aware of). The cdr is necessary because the first item of the list returned by (argv) is the name of the calling program (e.g. “./foo”).
(If there’s a better way, please let me know, as my knowledge about the compiler is limited.)
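For comparison, here is a rough Python sketch of the same layout, with an explicit main that receives the argument list. Python does not look for a function called main by itself; the names main and args are just my choice for this illustration, and bar returns its message rather than printing it so the sketch is easy to check.

```python
# foo.py -- hypothetical Python counterpart of the foo.scm layout
import sys


def bar(x):
    # Return the message instead of printing it, so importers can reuse it.
    return "bar says: %s" % (x,)


def main(args):
    # args holds only the real command line arguments (program name removed),
    # similar to what csi -ss hands to the Scheme main.
    lines = ["args: %r" % (args,), bar(42)]
    return "\n".join(lines)


if __name__ == "__main__":
    # sys.argv[0] is the script name, much like the first element of (argv).
    print(main(sys.argv[1:]))
```

Slicing with sys.argv[1:] plays the same role as the cdr in the postlude expression.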
Next up: parsing command line arguments in both Python and Scheme…
Just discovered something neat. In Python, if a function has arguments that have a default value, then those defaults are bound when the function is defined. So the following function, when called with no arguments, always returns the same value:
    >>> import random
    >>> def f(x=random.randrange(0, 100)): return x
    ...
    >>> f()
    32
    >>> f()
    32
    >>> f()
    32
…because the default value for x is determined when f is defined, rather than when it’s called.
And this is not allowed at all:
    >>> def z(a, b=a): print a, b
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'a' is not defined
I assumed that these rules would be the same in Chicken, but it turns out that this is not the case. This allows for some cool constructs that simply aren’t possible in Python. Like the example with the random number:
    > (define (f #!optional (x (random 100))) x)
    > (f)
    23
    > (f)
    89
    > (f)
    97
It appears that (random 100) is computed whenever f is called, rather than when it’s defined. We can also refer to other arguments in this default expression:
    > (define (g a #!optional (b (+ a 10)))
    >   (list a b))
    > (g 3)
    (3 13)
    > (g 40)
    (40 50)
Good to know. This behavior is intentional rather than accidental, judging from the Extensions to the standard section in the user manual.
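Python can get close to the Chicken behavior with the usual sentinel trick: use None as the default and compute the real value inside the function body. This is only a sketch of that workaround, reusing the names f and g from the Scheme examples; it assumes None is never a meaningful argument value.

```python
import random


def f(x=None):
    # None means "not supplied"; the default is computed on every call.
    if x is None:
        x = random.randrange(0, 100)
    return x


def g(a, b=None):
    # The default for b can depend on a because it is computed in the body.
    if b is None:
        b = a + 10
    return [a, b]
```

With this, g(3) gives [3, 13] and g(40) gives [40, 50], matching the Scheme session, and f() draws a fresh random number each time.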
Python and Scheme have different philosophies when it comes to strings. Scheme strings are mutable and consist of characters, which are a separate type. By contrast, Python’s strings are immutable, and its “characters” are really strings with a length of one.
Also, Python uses both " " and ' ' for string literals, while Scheme only uses " ".
Aside from that, strings can be used in these languages in ways that are very similar (as opposed to e.g. C’s strings which tend to involve memory allocation and pointer arithmetic). So in this post, I will be focusing on common string operations, and what they look like in both Python and Scheme.
In Python, all these string operations work out of the box. In Scheme, some are provided by R5RS, while others are found in SRFI-13 (a very useful library which has a large number of non-trivial string operations), and yet others are included by Chicken (but not necessarily part of other Scheme implementations). In the examples below, I’m assuming Chicken with SRFI-13 imported.
Python has the very obvious + operator to concatenate two strings, something which won’t work in Scheme; (+ "a" "b") is an error. It also has the butt-ugly str.join method to join a list of strings. Scheme has string-append (R5RS) and string-join (SRFI-13).
    # Python
    >>> "hello" + " " + "world"
    'hello world'
    >>> " ".join(['my', 'name', 'is', 'poison'])
    'my name is poison'

    ;; Scheme
    > (string-append "hello" " " "world")
    "hello world"
    > (string-join '("my" "name" "is" "poison") " ")
    "my name is poison"
These functions are very simple, but I’m mentioning them anyway because there might be surprises here for people coming from other languages. Python uses the len() function rather than a method (like e.g. Ruby and Io do). Scheme uses string-length rather than length (which only works on lists).
    # Python
    >>> len("koyaanisqatsi")
    13

    ;; Scheme
    > (string-length "koyaanisqatsi")
    13
Python uses the [] syntax for indexing and slicing; it also accepts negative numbers (to count from the end of the string). Scheme has string-ref and substring, which work similarly, except they don’t take negative values. (Note that string-ref returns a *character* rather than a one-length string.)
    # Python
    >>> s = "hello"
    >>> s[0]
    'h'
    >>> s[2]
    'l'
    >>> s[1:3]
    'el'
    >>> s[-3:]
    'llo'

    ;; Scheme
    > (define s "hello")
    > (string-ref s 0)
    #\h
    > (string-ref s 2)
    #\l
    > (substring s 1 3)
    "el"
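Since the Scheme functions don’t take negative values, it may help to spell out what a negative index means in Python: it is shorthand for counting from len(s). A small sketch (the variable names are only for illustration):

```python
s = "hello"

# A negative index i is equivalent to len(s) + i.
last_three = s[-3:]
same_thing = s[len(s) - 3:]

# Indexing yields a one-character string, not a separate character type.
first = s[0]
```

The second form is roughly what one would pass to substring on the Scheme side, using string-length in place of len.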
In Python, strings are compared with the usual == family of operators. Case matters; “a” does not compare equal to “A”. Scheme, on the other hand, has a number of functions to do the comparison; string=? and friends for case-sensitive comparing like in Python, and the string-ci=? family for case-insensitive comparing.
    # Python
    >>> "abc" == "abc"
    True
    >>> "abc" == "ABC"
    False
    >>> "b" > "a"
    True

    ;; Scheme
    > (string=? "abc" "ABC")
    #f
    > (string-ci=? "abc" "ABC")
    #t
    > (string>? "b" "a")
    #t
SRFI-13 also provides equivalents for Python’s useful startswith() and endswith() methods:
    > (string-prefix? "He" "Herbert")
    #t
    > (string-suffix? "tt" "Abbott")
    #t
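For completeness, here is the Python side of these two ideas. Python has no direct counterpart to string-ci=?, so the usual approach is to normalize case before comparing; equal_ci is a hypothetical helper name, not something from the standard library.

```python
def equal_ci(a, b):
    # Hypothetical helper: compare two strings while ignoring case.
    return a.lower() == b.lower()


# The methods that string-prefix? and string-suffix? correspond to.
starts = "Herbert".startswith("He")
ends = "Abbott".endswith("tt")
```

Note that the argument order differs: in SRFI-13 the prefix comes first, while in Python it is the argument of a method on the full string.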
Speaks for itself. Note that R5RS defines char-upcase and char-downcase, but not string-upcase or string-downcase (those are in SRFI-13).
    # Python
    >>> "kibbles and bits".upper()
    'KIBBLES AND BITS'
    >>> "KIBBLES AND BITS".lower()
    'kibbles and bits'
    >>> "kibbles and bits".capitalize()
    'Kibbles and bits'
    >>> "kibbles and bits".title()
    'Kibbles And Bits'

    ;; Scheme
    > (string-upcase "kibbles and bits")
    "KIBBLES AND BITS"
    > (string-downcase "KIBBLES AND BITS")
    "kibbles and bits"
    > (string-titlecase "kibbles and bits")
    "Kibbles And Bits"
Splitting a string into a list of smaller strings is a common thing to do in high-level languages. Luckily, for common cases, we don’t have to resort to regular expressions. Python uses the split() method, Scheme has string-tokenize (SRFI-13) and string-split (Chicken).
    # Python
    >>> "a few good men".split()
    ['a', 'few', 'good', 'men']
    >>> "abracadabra".split("b")
    ['a', 'racada', 'ra']

    ;; Scheme
    > (string-tokenize "a few good men")  ;; SRFI-13
    ("a" "few" "good" "men")
    > (string-split "a few good men")     ;; Chicken built-in
    ("a" "few" "good" "men")
    > (string-split "abracadabra" "b")
    ("a" "racada" "ra")
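One Python detail that is easy to trip over: split() with no argument and split(" ") are not the same thing when the input has runs of spaces. A short sketch, with the input built explicitly so the double space is visible:

```python
# "a", two spaces, then the rest of the sentence.
text = "a" + " " * 2 + "few good men"

# No argument: split on any run of whitespace, never producing empty strings.
by_whitespace = text.split()

# Explicit separator: every single space is a boundary, so an empty string
# shows up between the two adjacent spaces.
by_single_space = text.split(" ")
```

Here by_whitespace is ['a', 'few', 'good', 'men'], while by_single_space is ['a', '', 'few', 'good', 'men'].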
To trim characters from the left and/or right side of a string, Python uses the lstrip (from the left), rstrip (from the right) or strip (both sides) methods. Somewhat asymmetrically, in Scheme (or, more precisely, SRFI-13) these functions are called string-trim (from the left), string-trim-right (from the right) and string-trim-both (both sides).
Both Python and Scheme allow you to specify the character(s) that need to be stripped. By default, whitespace is removed, as this seems to be the most common use case.
    # Python
    >>> s = " i like cookies "
    >>> s.strip()
    'i like cookies'
    >>> s.lstrip()
    'i like cookies '
    >>> s.rstrip()
    ' i like cookies'
    >>> "xxxhi!xxx".strip("x")
    'hi!'

    ;; Scheme
    > (define s " i like cookies ")
    > (string-trim s)
    "i like cookies "
    > (string-trim-right s)
    " i like cookies"
    > (string-trim-both s)
    "i like cookies"
    > (string-trim-both "xxxhi!xxx" #\x)
    "hi!"
In Python, you can loop over a string (using for) or turn it into a list, but what you get is essentially a list of strings with length one. In Scheme, you get characters. Use string->list to get a list of characters, and string-map to map one string to another (much like the regular map, but it takes and returns a string).
    # Python
    >>> for c in "hello": print c,
    ...
    h e l l o
    >>> list("hello")
    ['h', 'e', 'l', 'l', 'o']

    ;; Scheme
    > (string->list "hello")
    (#\h #\e #\l #\l #\o)
    > (for-each
    >   (lambda (c) (printf "~a! " c))
    >   (string->list "hello"))
    h! e! l! l! o!
    > (string-map char-upcase "hello")
    "HELLO"
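Python has no built-in string-map, but joining a generator expression gets the same effect. The helper below is a hypothetical sketch that borrows the Scheme name; it assumes func returns a string for each one-character input.

```python
def string_map(func, s):
    # Hypothetical helper: apply func to each one-character string, then join
    # the results back into a single string.
    return "".join(func(c) for c in s)


shouted = string_map(str.upper, "hello")
```

Here shouted ends up as 'HELLO', like the string-map example.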
Python has several ways to search strings for contents… like the find/rfind methods (and their index/rindex counterparts) to find the index of a matching substring, and the in operator if you just want to know if a string has a certain substring, but don’t need to know where exactly it starts.
You can do the same things in Scheme, assuming you use SRFI-13, as R5RS does not define any of this. string-index searches for a character (or a character set or a predicate), string-contains searches for a substring. When found, it returns the index, otherwise #f (which is useful because it allows one to write (if (string-contains s1 s2) …)).
    # Python
    >>> "lemon-flavored jellibeans".find("e")
    1
    >>> "lemon-flavored jellibeans".rfind("e")
    21
    >>> "lemon-flavored jellibeans".find("el")
    16
    >>> "lemon-flavored jellibeans".find("xyz")
    -1
    >>> "el" in "lemon-flavored jellibeans"
    True

    ;; Scheme
    > (string-index "lemon-flavored jellibeans" #\e)
    1
    > (string-index-right "lemon-flavored jellibeans" #\e)
    21
    > (string-contains "lemon-flavored jellibeans" "el")
    16
    > (string-contains "lemon-flavored jellibeans" "xyz")
    #f
Python’s replace() method is very easy: simply specify the substring that needs to be replaced, and its replacement. By contrast, SRFI-13's string-replace is more sophisticated. It takes a string, a replacement string, and start/end indices that indicate what part of the string needs to be replaced. See the example below.
    # Python
    >>> "I like cookies".replace("cookie", "hot dog")
    'I like hot dogs'

    ;; Scheme
    > (string-replace "i like cookies" "hot dog" 7 13)
    "i like hot dogs"
I don’t know if there’s a version that is easier to use floating around somewhere (in a SRFI or otherwise), but it’s not so hard to write something that emulates the Python behavior:
    (define (string-replace-v2 s before after)
      (let ((idx (string-contains s before)))
        (if idx
            (string-replace s after idx (+ idx (string-length before)))
            s)))
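One difference worth noting: string-replace-v2 only touches the first match (it calls string-contains once), whereas Python’s replace() changes every occurrence unless you pass a count. The sketch below shows that, plus an index-based helper in the spirit of SRFI-13’s string-replace; string_replace is a hypothetical name of mine and ignores the optional arguments the real SRFI function accepts.

```python
def string_replace(s, replacement, start, end):
    # Hypothetical helper mirroring the SRFI-13 argument order:
    # replace the slice s[start:end] with replacement.
    return s[:start] + replacement + s[end:]


by_index = string_replace("i like cookies", "hot dog", 7, 13)

# str.replace changes every occurrence unless a count is given.
all_hits = "a-b-c".replace("-", "+")
first_hit = "a-b-c".replace("-", "+", 1)
```

So first_hit ('a+b-c') is the behavior that matches string-replace-v2, while all_hits ('a+b+c') is what a plain replace() call does.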
Also, Chicken has string-translate*, which works for our purposes, but is used with a table of elements to be replaced:
    > (string-translate* "i like cookies"
    >   '(("cookie" . "hot dog")))
    "i like hot dogs"
This has become a long post, longer than I intended, and there are still many things I haven’t even touched upon yet… like Unicode, or the fact that some of the aforementioned functions have equivalents that change the string in-place, rather than returning a new string. Anyway, this wasn’t meant to be a complete reference; it’s more of a starting point, or a quick way to look up “I can do X in Python, how do I do it in Scheme?”
In short, Python’s dictionaries are mutable objects that associate unique keys with values. There is special syntax to create them.
Now, going by R5RS, Scheme doesn’t even have anything similar. All it has is three closely related functions, that look up pairs in a list, based on a “key” which is matched to the first element of each pair. These functions are assq, assv and assoc. (See here in R5RS.)
Naturally, it’s possible to write extensions in Scheme that look more like Python’s dict (or Ruby’s Hash, etc), and I’m sure people have done so; for example, SRFI-69 defines hash tables. But for now, let’s see what we can do with the bare-bones approach.
Its usage is simple: you define a list of pairs, possibly augmenting them by consing new pairs onto it, or deleting elements from it. (The order doesn’t really matter.) Then you use the aforementioned functions to look up “keys” (matched to the first element in each pair). If not found, #f is returned, otherwise the matching pair.
That’s right; the whole pair is returned, not just the second element of the pair. By doing so, Scheme sidesteps the problem that some languages have (e.g. Ruby); it either returns a pair (found) or #f (not found), so there can never be any confusion whether the key was found or not. By contrast, in Ruby, if myhash[key] returns nil, that could mean that the key was found and that its associated value was nil, *or* that it was not found at all. (Python doesn’t have this problem either; it raises a KeyError exception if the key is not found; in addition, the Ruby behavior can be emulated with the dict.get() method.)
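To make the Python side of that remark concrete, here is a small sketch of the three kinds of lookup. The dictionary and its keys are made up for the illustration.

```python
d = {"present": None}

# dict.get cannot tell a stored None from a missing key...
ambiguous = (d.get("present"), d.get("absent"))

# ...but the in operator can.
found = ("present" in d, "absent" in d)

# Plain indexing raises KeyError for a missing key.
try:
    d["absent"]
    raised = False
except KeyError:
    raised = True
```

Both entries of ambiguous are None, which is the Ruby-style confusion; found is (True, False), and raised is True.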
Anyway, here’s an example (R5RS only but we’re secretly assuming that filter exists):
    (define language-designers
      '((guido python) (matz ruby) (rasmus php) (larry perl)))

    ; add some...
    (define language-designers
      (cons '(felix chicken) language-designers))

    ; I don't like PHP; remove it :-)
    (define language-designers
      (filter (lambda (pair) (not (equal? (cadr pair) 'php)))
              language-designers))

    (print (assoc 'guido language-designers))  ; => (guido python)
    (print (assoc 'hans language-designers))   ; => #f
Note how the usage is completely different from Python’s dicts. This can be written differently — hell, OF COURSE it can be written differently, it’s Scheme! :-) But let’s stick with this approach for a minute.
(I’m using assoc here, which compares key and first element using the equal? predicate. assq and assv basically do the same thing, using the eq? and eqv? predicates, respectively. Equality testing in Scheme will be dealt with in yet another forthcoming post…)
Basically, this is all you need. Since a “dictionary” is just a list of pairs, all the usual list operators apply, and can be used to write any Python-esque dict operations fairly easily: keys, has_key, adding and removing using [], etc. (Doing so is left as an exercise for the reader. ;-)
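To show how little is involved, here is the same idea sketched in Python, with a list of tuples standing in for the association list. This is not the Scheme exercise itself, just an illustration of the logic; assoc, keys and has_key are hypothetical helpers, and False stands in for #f.

```python
def assoc(key, pairs):
    # Return the whole matching pair, or False when nothing matches (like #f).
    for pair in pairs:
        if pair[0] == key:
            return pair
    return False


def keys(pairs):
    # The "keys" are simply the first elements of the pairs.
    return [pair[0] for pair in pairs]


def has_key(key, pairs):
    # A pair is never False, so this test is unambiguous.
    return assoc(key, pairs) is not False


designers = [("guido", "python"), ("matz", "ruby"), ("larry", "perl")]
```

With this, assoc("guido", designers) returns ('guido', 'python') and assoc("hans", designers) returns False, mirroring the Scheme session.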
Fortunately, the author of SRFI-1 (a list library) recognized the need for such functions, and supplied a few of them. Using the SRFI, we could write:
    ; the quoted list '(chicken) keeps the (name language) shape of the other entries
    (define language-designers
      (alist-cons 'felix '(chicken) language-designers))
    (define language-designers
      (alist-delete 'rasmus language-designers))
… which is at least a bit clearer.
(More about Scheme lists, and SRFI-1 which is *very* necessary, in a separate post.)
Python has a nifty feature called list comprehensions. Originally borrowed from Haskell, this is a (mostly) compact way to generate lists, without having to resort to the map and filter functions (which are often considered less clear and tend to rely on Python’s underpowered lambda).
Scheme, being a functional language, has an entirely different attitude toward map, filter and friends. Their use is encouraged, rather than being downplayed. In spite of that, there are situations where list comprehensions (or some form thereof) would be useful. So, SRFI-42 introduces “eager comprehensions”.
Rather than providing one construct, SRFI-42 has a number of forms, some of which seem to exist for efficiency reasons. For now, I’m just going to look at list-ec, which (in its most basic form) is a close cousin of Python’s list comprehensions.
Let’s run a simple example in Chicken:
    > (use syntax-case)
    ; loading /usr/local/lib/chicken/3/syntax-case.so ...
    ; loading /usr/local/lib/chicken/3/syntax-case-chicken-macros.scm ...
    > (use srfi-42)
    ; loading /usr/local/lib/chicken/3/srfi-42.scm ...
    ; loading /usr/local/lib/chicken/3/srfi-42-support.so ...
    > (list-ec (: i 5) (* i i))
    (0 1 4 9 16)
In Python, this would look like:
    >>> [i*i for i in range(5)]
    [0, 1, 4, 9, 16]
In this case, we could easily write this with a map… if Scheme had something similar to Python’s range function, which it doesn’t. In fact, the lack of such a construct seems to have been a major motivation for writing SRFI-42. The author states:
“The origin of this SRFI is my frustration that there is no simple [form] for the list of integers from 0 to n-1. With this SRFI it is (list-ec (: i n) i).”
Moreover, much like in Python, eager comprehensions are capable of more powerful expressions, that are not so easily written using map. Like this one, which isn’t terribly complex:
    > (list-ec (: n 1 4) (: i n) (list n i))
    ((1 0) (2 0) (2 1) (3 0) (3 1) (3 2))

    # Python:
    >>> [(n,i) for n in range(1, 4) for i in range(n)]
    [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
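If the nested form looks opaque, it may help to unroll it. In Python, the clauses of a comprehension nest from left to right exactly like for loops, and as far as I can tell the SRFI-42 qualifiers read the same way. A sketch:

```python
# The comprehension unrolled into explicit loops.
pairs = []
for n in range(1, 4):      # outer clause, like (: n 1 4)
    for i in range(n):     # inner clause, like (: i n); it may refer to n
        pairs.append((n, i))

# The one-line version builds the same list.
same = [(n, i) for n in range(1, 4) for i in range(n)]
```

The inner range depending on the outer variable is what makes this awkward to express with a plain map.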
Much, much more is possible using SRFI-42, most of which I don’t understand yet :-), so for now I’m sticking to these simple examples. There will probably be a part II to this post, at some point. However, as a teaser, here’s a way to write Python’s enumerate (that works on strings and lists and probably some other types):
    > (define (enumerate seq)
    >   (list-ec (: x (index i) seq) (list i x)))
    > (enumerate "hello")
    ((0 #\h) (1 #\e) (2 #\l) (3 #\l) (4 #\o))
    > (enumerate '(guido larry matz))
    ((0 guido) (1 larry) (2 matz))
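For reference, this is what the Python built-in gives for the same inputs. Wrapping the result in list() is only there to make the pairs visible, since enumerate produces them lazily.

```python
letters = list(enumerate("hello"))
people = list(enumerate(["guido", "larry", "matz"]))
```

letters comes out as [(0, 'h'), (1, 'e'), (2, 'l'), (3, 'l'), (4, 'o')] and people as [(0, 'guido'), (1, 'larry'), (2, 'matz')], so the Scheme version above lines up with it, apart from characters versus one-length strings.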
In Python, it’s common to format string output using the % operator. (There are other ways, but this is the oldest and most widely used.) A few examples of frequently used string substitutions:
    # integer
    >>> "I ate %d hamburgers" % (4,)
    'I ate 4 hamburgers'
    # string
    >>> "Hello, my name is %s" % ("John Doe",)
    'Hello, my name is John Doe'
    # float rounded to 2 decimals
    >>> "Execution took %0.2f seconds" % (12.5678,)
    'Execution took 12.57 seconds'
    # compound objects
    >>> "Output: %s" % (locals().keys(),)
    "Output: ['__builtins__', '__name__', '__doc__']"
    # repr
    >>> "Invalid argument: %r" % ("foobar",)
    "Invalid argument: 'foobar'"
    # multiple values
    >>> "Brought to you by the letters %s, %s and %s" % ("P", "Q", "R")
    'Brought to you by the letters P, Q and R'
Now, on to Scheme. As far as I can tell, R5RS has nothing of the sort. Which is somewhat surprising, or then again maybe not, when you realize that it’s not so difficult to write a formatting function that works for most cases.
Which is exactly what Chicken provides, in the extras unit. The printf, sprintf, fprintf and format functions all take a format string and a number of arguments (so no messing with tuples like Python does — although that is a consequence of using an operator rather than a function).
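In case the remark about tuples seems obscure, this sketch shows the pitfall it refers to: when the value being formatted is itself a tuple, the % operator treats it as the argument list unless it is wrapped in a one-element tuple. The variable names are made up.

```python
point = (1, 2)

# Wrapping the tuple in a one-element tuple formats it as a single value.
ok = "Output: %s" % (point,)

# Passing the tuple directly makes % treat it as two separate arguments,
# which does not match the single %s in the format string.
try:
    "Output: %s" % point
    failed = False
except TypeError:
    failed = True
```

A function-style interface like printf has no such ambiguity, because each value is a separate argument.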
While printf and friends are underpowered, they work for most cases:
    > (printf "Once again, I ate ~a hamburgers.~n" 4)
    Once again, I ate 4 hamburgers.
    > (printf "I saw ~a at the ~a ~a, having ~a ~as.~n" "Fred" 'foo 'bar '17 'martini)
    I saw Fred at the foo bar, having 17 martinis.
From what I’ve seen, the ~a and ~n directives are the most common; there are a few more, like ~x to display a number as hexadecimal, and ~\n to skip all whitespace in the string until the next non-format character. (Note that this actually *extends* SRFI-28, which describes a form of string formatting that is rather bare-bones.)
As said, this works for most cases, but not when you, for example, want to display a float with 2 decimals. printf has no equivalent for Python’s “%0.2f”.
Fortunately, there are extensions that do allow this. The format egg, for instance, provides Common Lisp-like format strings.
Installing the egg is easy, more about which in a separate post; for now, suffice to say that
$ sudo chicken-setup format
should probably do the trick. Then, use the egg:
    > (use format)
    ; loading /usr/local/lib/chicken/3/format.so ...
    > (format #f "The pie was ~,2F cm thick.~%" 3.1415)
    "The pie was 3.14 cm thick.\n"
(It provides a function format that, when imported, overrides the built-in format function. printf and friends are not affected, though.)
More information about the directives can be found in the documentation of the format egg, but basically they work the same way as the directives found in printf & co. There’s just more of them, with more options.
In part I, I looked at the handling of function arguments as specified in R5RS. Chicken adds three more ways to handle special arguments, using the keywords #!optional, #!rest and #!key. (Here’s the relevant page in the Chicken documentation.)
The way I understand it, #!rest is just another way to collect optional arguments in a list. For our purposes, the following code is equivalent to writing the argument list as (r a b . args):
    > (define (r a b #!rest args) (list a b args))
    > (r 1 2)
    (1 2 ())
    > (r 1 2 3)
    (1 2 (3))
Optional arguments can also be specified separately (rather than collecting them all in a list). For this, use the #!optional keyword, followed by a list of names. Arguments passed to the function are associated with these names based on position. If not specified, the name defaults to #f.
    > (define (o a #!optional b c) (list a b c))
    > (o 1)
    (1 #f #f)
    > (o 2 'fred)
    (2 fred #f)
    ;; roughly equivalent to Python:
    ;; def o(a, b=None, c=None): ...
We can also specify defaults:
    > (define (p #!optional (a 42) (b 'cookie)) (list a b))
    > (p)
    (42 cookie)
    > (p 103)
    (103 cookie)
    ;; roughly equivalent to Python:
    ;; def p(a=42, b="cookie"): ...
Again, values are associated with names based on position.
It would be useful if we could specify a parameter’s name. What if, in the above example, we want to override b but not a? In that case, we need to use #!key. This works much like #!optional, except that we can specify names (using #:name syntax). In fact, we *must* specify names, because positional arguments are ignored (unless we also specify #!optional or #!rest).
    > (define (k #!key (a 42) (b 'cookie)) (list a b))
    > (k)
    (42 cookie)
    > (k 1)
    (42 cookie)               ;; neither a nor b is 1
    > (k #:b 'possum)
    (42 possum)
    > (k #:b 'possum #:a 99)  ; order of args doesn't matter
    (99 possum)
    > (k b: 'possum)          ; this is also allowed
    (42 possum)

So, in order to set a value for a, we must use #:a value in the function call. a: value is also allowed.
Note that if a function specifies both #!key and #!rest, keyword arguments passed to the function will show up in the rest argument as well, no matter whether there’s a matching name in #!key or not.
    > (define (z #!rest rest #!key (a 42) (b 'cookie)) (list a b rest))
    > (z)
    (42 cookie ())
    > (z 1 2)
    (42 cookie (1 2))
    > (z #:b 'soup 52)
    (42 soup (b: soup 52))
    > (z #:a 10 #:q 129)
    (10 cookie (a: 10 q: 129))
In fact, using just #!rest, we can emulate Python’s **kwargs construct (sort of):
    > (define (y #!rest r) r)
    > (y #:a 3)
    (a: 3)
    > (y 1 2 3 #:foo 'bar)
    (1 2 3 foo: bar)
(“Parsing” this list of rest args requires a bit of special code to collect the pairs; at this point, I’m not sure if Chicken provides such a function out of the box.)
By the way, we can pass the resulting list to apply without problems:
    > (apply y '(1 2 3))
    (1 2 3)
    > (apply y '(1 2 3 foo: bar))
    (1 2 3 foo: bar)
    > (apply y '(1 2 3 #:foo bar))
    (1 2 3 foo: bar)
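For comparison, here is the Python construct being emulated. The function name y is reused from the Scheme example; note that Python separates positional and keyword arguments into a tuple and a dict, where the Chicken version delivers one flat list.

```python
def y(*args, **kwargs):
    # Positional arguments arrive as a tuple, keyword arguments as a dict.
    return args, kwargs


collected = y(1, 2, 3, foo="bar")

# Unpacking with * and ** plays the role that apply plays in Scheme.
positional = [1, 2, 3]
keywords = {"foo": "bar"}
reapplied = y(*positional, **keywords)
```

Both calls return ((1, 2, 3), {'foo': 'bar'}).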
What I like about the Chicken construct is that it doesn’t mix up positional and keyword arguments (unlike Python, although there’s a PEP to fix this in Python 3000).
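I assume the PEP in question is the one on keyword-only arguments (PEP 3102). Under that assumption, this is roughly what the Python 3 counterpart of the #!key function k looks like; it will not run on Python 2.

```python
def k(*, a=42, b="cookie"):
    # The bare * means a and b can only be passed by name (Python 3 syntax).
    return [a, b]


# Unlike the Chicken version, a stray positional argument is an error
# rather than being silently ignored.
try:
    k(1)
    rejected = False
except TypeError:
    rejected = True
```

So k(b="possum", a=99) gives [99, 'possum'] regardless of argument order, while k(1) raises TypeError where the Scheme (k 1) quietly returns the defaults.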
Anyway, while all this is powerful, it’s probably generally a good idea not to make argument lists too complicated. (The same is true in Python, by the way.)
Python supports various ways to handle arguments passed to functions:
    # regular
    def f(a, b, c): ...
    # default arguments
    def f(a, b=3, c=100): ...
    # any number of positional parameters
    def f(x, *args): ...
    # any number of keyword arguments
    def f(y, **kwargs): ...
    # or a mixture!
    def f(self, *args, **kwargs): ...
Scheme supports a few similar forms, and what it doesn’t have is complemented by Chicken. Let’s have a look.
The “regular” argument list looks like this:
    > (define (f a b c) (list a b c))
    > (f 1 2 3)
    (1 2 3)
    > (f 3 4)
    ...error: not enough arguments...
    ;; (Python) def f(a, b, c): ...
f takes three arguments, no more, no less.
    > (define (g a b . rest) (list a b rest))
    > (g 1 2)
    (1 2 ())
    > (g 1 2 3)
    (1 2 (3))
    > (g 1 2 3 4 5 6 7)
    (1 2 (3 4 5 6 7))
    ;; (Python) def g(a, b, *rest): ...
g takes two mandatory arguments, and any number of optional arguments, which are stored in the list rest. So, we can call (g 1 2) in which case rest is empty, but we can also call it as (g 1 2 3 4) which gives rest the value (3 4).
If we want a function that takes any number of arguments, with no mandatory ones, we use the following:
    > (define (h . args) (list args))
    > (h)
    (())
    > (h 1 2 3)
    ((1 2 3))
    ;; (Python) def h(*args): ...

Note, however, that this doesn’t work for a lambda (because (lambda (. args)) is not valid syntax). In that case, use the bare symbol args in place of the parenthesized argument list:

    > (define i (lambda args (list args)))
    > (i)
    (())
    > (i 1 2 3)
    ((1 2 3))
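Python has no such special case: a lambda accepts the same *args syntax as def. A one-line sketch, reusing the name i from the Scheme example:

```python
# A lambda that collects all positional arguments, like (lambda args ...).
i = lambda *args: [args]
```

Calling i() gives [()] and i(1, 2, 3) gives [(1, 2, 3)]; the collected arguments are a tuple rather than a list.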
This is all the argument handling that Scheme has to offer (in R5RS; read more about it here). Chicken has several extensions to this standard, more about which tomorrow.