Saturday, May 21, 2011

Clojure Lists and Forms

I'm working through Programming Clojure by Stuart Halloway. I am in the middle of chapter 2 (exploring Clojure). The syntax is different to anything I have used before and it's a little hard to understand at times, so I have decided to take time to do some background research.


As usual, Google throws up a page from the fountain of all knowledge - Wikipedia's article on Lisp.


Lists

Clojure is a Lisp implementation (aka "a Lisp"). The name Lisp derives from "LISt Processing" because in a Lisp, code is represented as a list like this one:
(foo 1 2 3)


The list is enclosed in parentheses and the items in the list are separated by spaces. The simplest items (symbols like foo and literals like 1) are known as atoms, and an item can also be another list. The first item is an operator (usually a function call; foo in this case) and the remaining items (1, 2 and 3) are arguments to the function. The bracket notation is known as an S-expression and the use of the first item in the list to represent an operator is called prefix notation. Prefix notation provides a very regular syntax that makes it easy to generate code for metaprogramming.


So, if we know the str function creates a string from its arguments, we can write:
(str "hello" " world")
=> "hello world"


This looks quite similar to most programming languages, but it looks a bit odd when you see an arithmetic operation for the first time:
(+ 1 2 3) 
=> 6


Most languages use infix operators for arithmetic, so in Java or Ruby this s-expression would be:
1 + 2 + 3 
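One consequence of prefix notation is that an operator is written once and can take any number of arguments. A minimal sketch of that idea, using only core arithmetic and comparison functions:

```clojure
;; The operator appears once, however many arguments follow.
(println (+ 1 2 3 4 5))  ; prints 15
(println (* 2 3 4))      ; prints 24

;; Comparison works the same way: true if the arguments are in increasing order.
(println (< 1 2 3))      ; prints true

;; With no arguments at all, + returns its identity value.
(println (+))            ; prints 0
```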


Clojure runs on the JVM and it also provides direct access to java.lang. You can call Java methods by prefixing the method name with a dot and putting it in the operator position, followed by the object it is called on.
"foo".toUpperCase();
becomes 
(.toUpperCase "foo")
=> "FOO"
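A few more interop calls in the same style, as a rough sketch. The methods used here (length, substring, Math.abs) are standard java.lang methods rather than anything from the book example. The Math/abs form is the Clojure syntax for calling a static method.

```clojure
;; Instance methods: (.method object args...)
(println (.toUpperCase "foo"))       ; prints FOO
(println (.length "foo"))            ; prints 3
(println (.substring "hello" 1 3))   ; prints el

;; Static methods use Class/method instead of a leading dot.
(println (Math/abs -5))              ; prints 5
```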

As you would expect, s-expressions can be nested to any depth:
(str (str "hello" " world") (str " from" " me"))
=> "hello world from me"
(+ (- 3 (* 2 5)) 8)
=> 1
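Because code is just a list, it can be built and taken apart like any other data, which is where the metaprogramming point made earlier comes from. A minimal sketch, where form is a name I made up for illustration:

```clojure
;; Build the list (+ 1 2 3) as ordinary data.
;; The quote in front of + stops it being evaluated as a function here.
(def form (list '+ 1 2 3))

(println form)          ; prints (+ 1 2 3)
(println (first form))  ; prints +        (the operator)
(println (rest form))   ; prints (1 2 3)  (the arguments)

;; eval treats the data as code and runs it.
(println (eval form))   ; prints 6
```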

Forms

Lisp code is read by a reader that can be programmed (in most Lisps - see below) using reader macros. These macros are pretty sophisticated and go beyond the preprocessor templating features in languages like C. For example, Clojure implements a reader macro that tells the reader to ignore everything after a semicolon up to the end of the line - in other words, a line comment.

(+ 1 2 3)  ; Comment: Adds 3 numbers together
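One way to see what the reader does is to hand it a string with read-string and print the data that comes back. This is only a sketch of a few built-in reader macros (comment, quote and deref), not a full list:

```clojure
;; The comment macro: the reader skips the comment line and returns the list after it.
(println (read-string "; a line comment\n(+ 1 2 3)"))  ; prints (+ 1 2 3)

;; The quote macro: 'x is shorthand that the reader expands to (quote x).
(println (read-string "'(1 2 3)"))                     ; prints (quote (1 2 3))

;; The deref macro: @x expands to a call to clojure.core/deref.
(println (read-string "@x"))                           ; prints (clojure.core/deref x)
```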

I would need to dig deeper to find out, but I can imagine there *may* have been a time when everything in Lisp was represented as an s-expression with a bunch of reader macros to provide shortcuts and "syntax". I'm sure that would have been possible, but it would have severely limited portability because I would need to have your macros to run your code, and they might conflict with mine.

Thankfully, Clojure provides proper syntax for important constructs and a limited set of reader macros that cannot be extended. The syntax elements are known as forms:


| Form | Usage | Syntax | Example |
|---|---|---|---|
| Boolean | Conditional logic | true, false | true, false |
| Character | A single character | \<char> | \a |
| Keyword | A constant value that refers to itself - like a Symbol in Ruby | :<name> | :foo |
| List | S-expressions, lists of data | (*<atoms>) | (str "a" "b") |
| Map | Hashes | {key1 value1, key2 value2} | {:name "Bill", :age 42} |
| Nil | The absence of a value | nil | nil |
| Number | Integers, decimals | *digits[.*digits] | 100, 12.45 |
| Set | A set of non-duplicate values | #{<value1> <value2>} | #{1 "a" :foo} |
| String | A (Java) string | "<text>" | "foo" |
| Symbol | A handle on an entity - e.g. a function | *<text> | user/foo, java.lang.String |
| Vector | An ordered sequence of data separated by spaces | [<value1> <value2>] | [1 :bar "x" 4.3 \a true] |
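A quick way to check which form a literal is in the REPL is to ask it with one of the core predicate functions. This is just a sketch using the example values from the table; every line should print true.

```clojure
(println (true? true))                          ; Boolean
(println (char? \a))                            ; Character
(println (keyword? :foo))                       ; Keyword
(println (list? '(1 2 3)))                      ; List (quoted so it is not called)
(println (map? {:name "Bill", :age 42}))        ; Map
(println (nil? nil))                            ; Nil
(println (number? 12.45))                       ; Number
(println (set? #{1 "a" :foo}))                  ; Set
(println (string? "foo"))                       ; String
(println (symbol? 'user/foo))                   ; Symbol (quoted so it is not looked up)
(println (vector? [1 :bar "x" 4.3 \a true]))    ; Vector
```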


I don't know whether these are native to Clojure or borrowed from other Lisps, but it's not really important to my learning right now.

The usage for these forms is pretty much as you would expect. We've already seen the string and number literals in use above. I won't document them here because there are plenty of Internet resources out there.

* Chapter 2 of Programming Clojure covers all the forms in detail

Note though that Clojure is a dynamic language so you can mix and match different types within a data structure.

The conj function adds items to the end of a vector and returns the result as a new vector:
(conj [1 2] 3)
=> [1 2 3]

We can also mix up the types:
(conj [] 1 2.3 \s "foo" nil true)
=> [1 2.3 \s "foo" nil true]
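One detail worth seeing for yourself is that conj does not change the vector it is given; it returns a new one. It also works on the other collection forms, though where the item ends up depends on the collection. A small sketch, where v and v2 are names I made up for illustration:

```clojure
(def v [1 2])
(def v2 (conj v 3))

(println v)    ; prints [1 2]    (the original vector is unchanged)
(println v2)   ; prints [1 2 3]

;; On a list, conj adds at the front rather than the end.
(println (conj '(1 2) 3))          ; prints (3 1 2)

;; On a set, adding a value that is already present makes no difference.
(println (count (conj #{1 2} 2)))  ; prints 2
```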

Learnings

The important thing to remember is that everything is an s-expression.  Sometimes it can be hard to see in more complex expressions but you can usually figure out what is going on if you break everything down into a list and look for the s-expression.
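As a worked example of breaking an expression down, here is the nested arithmetic from earlier evaluated one s-expression at a time, starting from the innermost list:

```clojure
;; Full expression: (+ (- 3 (* 2 5)) 8)

(println (* 2 5))               ; innermost list first: prints 10
(println (- 3 10))              ; substitute the result: prints -7
(println (+ -7 8))              ; and again: prints 1

(println (+ (- 3 (* 2 5)) 8))   ; the whole thing in one go: prints 1
```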
