stats

Read Me Clojure, Pt. 3

Basic Constructs

Image

The specific syntax of Lisp dialects allows one to precisely define and distinguish basic constructs, add new syntactic elements, and even transform program code while it is running. This stems from the use of simple yet well-thought-out ways of organizing and representing source code.

Basic Constructs

Programs written in Lisp variants are characterized by simple syntactic rules. Rather than discussing them theoretically, we will begin with a practical example that we will refer back to in order to learn the basic mechanisms governing the translation of source code into a form understandable by a computer. By tracing what the compiler does, we will better understand the constructs of the language.

Here is our base example:

(print "Hello, Lisp!")
(print "Hello, Lisp!")

Not particularly difficult. Is it?

Syntax

Lisp looks like oatmeal with toenail clippings mixed in.

– Larry Wall

Let us start with syntax. The first thing that catches the eye when we see programs written in Lisp dialects is the placement of nearly every compound construct within parentheses. In other programming languages, parentheses serve to group selected syntactic elements, e.g., arguments when calling or defining a function, sets of conditions, or operations on values. In Lisps, parentheses are the fundamental lexical element used to give shape to the entire program and to every expression.

In Lisp-family languages we construct expressions using the so-called Polish notation (PN), also known as prefix notation. It consists of placing the operator (function name) first, followed by the operands (call arguments). Parentheses are used to mark the beginnings and ends of expressions.

Prefix notation differs from the infix notation popular in many programming languages, but not enough to make it very difficult to recognize the individual parts of expressions. Our sample program can be expressed in Ruby as follows:

print "Hello, Lisp!"
print "Hello, Lisp!"

And in C in the following way:

#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("Hello, Lisp!");
    return 0;
}
#include &lt;stdio.h&gt; int main(int argc, char *argv[]) { printf(&#34;Hello, Lisp!&#34;); return 0; }

The differences between the discussed notations can be well illustrated with mathematical operations. Let us look at two simple computations:

2 + 2 * 3
2 + 2 * 3

And the notation conforming to Clojure syntax:

(+ 2 (* 2 3))
(+ 2 (* 2 3))

We can notice that in the second example the addition operator and its operands are enclosed in parentheses, and the name of the operation is always in the first position. An advantage of this notation is that there is no need to remember operator precedence.

Furthermore, in Polish notation, to represent more complex constructs in which we need to express containment relations, we do not need additional grouping markers (such as, e.g., curly braces or indentation) or separators (e.g., semicolons). Syntactic processing of expressions in prefix notation is simpler and faster because, owing to the more clearly defined structure, it requires fewer rules.

Reader

Let us return to our program:

(print "Hello, Lisp!")
(print &#34;Hello, Lisp!&#34;)

The first stage of transforming its text into executable form is reading it into memory. In Lisps, the component responsible for this is called the reader. Its task is to open a file located on disk or an input stream associated with the user’s terminal and to subject the text being read to processing, in which we can distinguish two main phases:

  1. detecting known lexical constructs in the text;

  2. extracting grammatically correct expressions from the detected constructs and representing them as internal, in-memory structures.

Lexical analysis

The first phase of reading program sources into memory is lexical analysis. It consists of:

  • cleaning the input of unnecessary symbols;

  • recognizing in the character stream sequences matching the lexical units defined in the language lexicon, in this context called tokens;

  • extracting from the text fragments that have syntactic significance (so-called lexemes), which are a kind of instances of the detected tokens.

The result of lexical analysis applied to our example will be a stream of lexemes – that is, extracted fragments of the program text that have syntactic significance:

Lexeme Token name
( list literal (opening)
print symbol
"Hello, Lisp!" string literal
) list literal (closing)

During the described process (called tokenization) the extracted lexemes may optionally be annotated with information about their corresponding tokens, so that this task does not need to be repeated during subsequent analyses.

Because of the specific syntax of Clojure (similarly to other Lisp dialects), lexical analysis is greatly simplified and often amounts to passing control to the parser as soon as a lexeme is found.

Syntactic analysis

The second phase of processing source code into an in-memory form is syntactic analysis, also called parsing. It encompasses:

  • recognizing syntactic constructs in the stream of lexemes by comparing their kinds and positions against the grammatical rules of the language;

  • extracting expressions – that is, grammatical constructs whose values can be computed in later compilation stages;

  • producing in-memory representations of the found expressions using appropriate data structures;

  • placing the resulting objects in the abstract syntax tree (AST).

The result of syntactic analysis is the representation of the program’s source code in memory accessible to the compiler.

Reader forms

Reader forms is a term used in Clojure to describe lexical units used to build grammatically correct syntactic constructs.

The table below lists the basic reader forms. The first column contains the token name, the second column shows example lexemes, and the last column indicates the data type of the in-memory object that will represent the lexemes in the abstract syntax tree, provided grammatical requirements are met.

Token name Example lexemes Data type
symbol one
space/two
Symbol
nil literal nil nil
keyword literal :one
::two
:space/x
::space/y
Keyword
string literal "one two" java.lang.String
list literal (1 2 3) PersistentList
vector literal [1 2 3] PersistentVector
map literal {:a 1 :b 2}
::{:a 1 :b2}
:space{:a 1 :b 2}
PersistentArrayMap
PersistentHashMap
boolean literal true, false java.lang.Boolean
integer literal 1
0xff
017
2r1101
java.lang.Long
rational literal 1/2 Ratio
big number literal 1.2M
1N
java.math.BigDecimal
Bigint
float literal -2.7e-4 java.lang.Double

Reader macros

In Clojure, some syntactic constructs are implemented as so-called reader macros. These are also subroutines responsible for processing lexical units, but implemented somewhat differently. Instead of being hardcoded into the reader’s rule set, they reside in a special read table, where specific tokens are assigned to the subroutines responsible for their analysis.

In Clojure, the programmer cannot modify the read table, and there are no indications that such a capability is planned for the future. The motivation is to maintain a unified syntax across projects of diverse origin. However, we can use so-called tagged literals, which resemble reader macros, although they are subject to certain syntactic constraints.

The following table presents reader macros:

Token name Example lexemes Data type
quoting (quote) 'one
'(one two)
varies
syntax quoting
(syntax-quote)
`one
`(two three)
varies
syntax
unquoting

(syntax unquote)
~quote varies
syntax unquoting
with splicing

(syntax unquote-splicing)
~@(list 1 2) varies
metadata map
(metadata map)
^{:doc "Description"} PersistentArrayMap
PersistentHashMap
metadata key
(metadata key)
^:dynamic true PersistentArrayMap
PersistentHashMap
metadata tag
(metadata tag)
^Integer x PersistentArrayMap
PersistentHashMap
comment
(comment)
; comment none
character literal
(character literal)
\a, \b, \c, \newline java.lang.Character
dereference expression
(dereference expression)
@x java.lang.Class (with IFn)
Java member-access
operator

(Java member-access operator)
(. " a " trim)
(.trim " a ")
varies
Java member-access
operators

(Java member-access operators)
(.. "a" trim length) varies
dispatch macro (dispatch macro) # varies

The last entry in the table is the so-called dispatch macro. This is a general term for a subgroup of reader macros whose tokens all begin with the hash symbol (#). When the reader encounters this character, it passes control over further analysis of the construct to a separate macro table. Below is a list of tokens handled using the dispatch macro:

Token name Example lexemes Data type
Var quoting
(var-quote)
#'x Var
set literal
(set literal)
#{1 2 3} PersistentHashSet
regular expression
(regular expression)
#"one.tw[oO]" java.util.regex.Pattern
anonymous function literal
(anonymous function literal)
#(pr) java.lang.Class (with IFn)
anonymous function argument
(anonymous function argument)
#(pr %)
#(pr %1 %2)
PersistentTreeMap
ignore next form
(ignore next form)
#_ one two none
Java constructor call
(Java constructor call)
#name.type[:x]
#name.record{:x 1}
varies
reader conditional
(reader conditional)
$?(:clj "Clojure"
:cljs "ClojureScript")
varies
tagged literal
(tagged literal)
#symbol argument varies
inst tagged literal
(inst tagged literal)
#inst "2018-11-12" java.util.Date
UUID tagged literal
(UUID tagged literal)
#uuid "88b05082-392d-4e0a-89c9-cf62ec375c43" java.util.UUID

Note that some lexical units will not be represented by objects of a predetermined data type (marked “varies” in the type column). For example, quoting itself is not associated with producing a particular structure, but influences how a given construct is further parsed. We will also find tokens that do not cause any value to be generated (“none” in the type column), because they selectively exclude certain elements of the program text from the syntactic analysis process.

Also noteworthy is the type java.lang.Class with the annotation “with IFn”. This notation indicates that in Clojure, function objects are expressed by anonymous Java classes equipped with the IFn interface. This is related to the characteristics of functions in Java – they do not have properties that would allow using them directly.

S-expressions

At the grammatical level, every element of source code in Lisp is a symbolically written expression. The textual representations of expressions – those we see in an editor – are called symbolic expressions, abbreviated as S-expressions (sexprs, sexps).

The structure of S-expressions somewhat resembles XML or JSON – that is, we are dealing with a notation that allows expressing nested and ordered sets of values, though in a somewhat simpler way than the aforementioned formats.

An S-expression in Lisp can be defined as a kind of notation in which:

  • every element is an expression:
    • non-compound (called an atom) or
    • composed of S-expressions enclosed in parentheses and separated by a separator.
S-expression examples in Lisp
print                     ; atomic S-expression (not a list of S-expressions)
"Hello, Lisp!"            ; atomic S-expression (not a list of S-expressions)
(print "Hello, Lisp!")    ; list S-expression
()                        ; both list and atomic S-expression
(print (+ 2 2))           ; S-expression composed of nested S-expressions
print ; atomic S-expression (not a list of S-expressions) &#34;Hello, Lisp!&#34; ; atomic S-expression (not a list of S-expressions) (print &#34;Hello, Lisp!&#34;) ; list S-expression () ; both list and atomic S-expression (print (+ 2 2)) ; S-expression composed of nested S-expressions

An S-expression in Clojure will be defined in a somewhat richer way, because we have additional collection literals there. It will be a kind of notation in which:

  • every element is an expression:
    • non-compound (called an atom) or
    • composed of separated S-expressions:
      • in pairs, enclosed in curly braces – {...};
      • individually, enclosed in:
        • round parentheses – (...);
        • curly braces with a hash symbol – #{...};
        • square brackets – [...].

In the case of expressions in curly braces, the first elements of pairs must be unique values within the entire expression, and in the case of expressions in curly braces preceded by a hash symbol, each value must be unique. This check is performed already during syntactic analysis, and when a given element requires prior computation, during evaluation.

S-expression examples in Clojure
:one                      ; atomic S-expression
1                         ; atomic S-expression
print                     ; atomic S-expression
"Hello, Lisp!"            ; atomic S-expression
(print "Hello, Lisp!")    ; list S-expression
[1 2 3]                   ; vector S-expression
#{1 2 3}                  ; set S-expression
{:one 1 :two 2 :three 3} ; map S-expression
(str [1 2 3])             ; list and vector S-expressions
:one ; atomic S-expression 1 ; atomic S-expression print ; atomic S-expression &#34;Hello, Lisp!&#34; ; atomic S-expression (print &#34;Hello, Lisp!&#34;) ; list S-expression [1 2 3] ; vector S-expression #{1 2 3} ; set S-expression {:one 1 :two 2 :three 3} ; map S-expression (str [1 2 3]) ; list and vector S-expressions

The recursive definition of an S-expression may seem difficult to understand, so we will aid ourselves with our one-line program and perform a manual categorization of the elements present in it.

In the notation (print "Hello, Lisp!"):

  • (...) is an S-expression, because it is a symbolically written list of S-expressions;
  • print is an S-expression, because it is an atom;
  • "Hello, Lisp!" is an S-expression, because it is an atom.

Graphically, this set can be presented as follows:

S-expression

Unlike other Lisps, compound S-expressions in Clojure are built not only from lists marked by round parentheses, but also using additional markers that correspond to certain kinds of collections. Depending on the notation used, a data structure representing the appropriate kind of S-expression will be created in memory:

Notation Literal S-expression Structure Data type
(a...z) list list list PersistentList
[a...z] vector vector vector PersistentVector
{a b ... x y} map map map PersistentArrayMap
PersistentHashMap
#{a...z} set set set PersistentHashSet

Atoms

When discussing symbolic expressions, we mentioned a specific class of them called atoms. An atom is an element of Lisp syntax that is not compound (is not: a list, a set, a vector, or a map). The exceptions are the empty list, the empty set, the empty vector, and the empty map, which are both compound expressions and atoms.

However, one more important condition must be added to the above definition, a condition that determines whether a symbolic notation can be considered a Lisp atom. Let us recall our program:

(print "Hello, Lisp!")
(print &#34;Hello, Lisp!&#34;)

In the previous examples, we could see that print and Hello, Lisp! are atoms, but would any arbitrary set of characters that is not a pair of parentheses with content qualify? No. An atom must be a valid syntactic construct on the basis of which the reader will be able to decide what object to place in the AST. There is a certain rigor here. In this particular case, the text print will be stored as a symbol, and Hello, Lisp! as a string, because they satisfy the syntactic requirements for representing specific data structures.

Admittedly, the syntactic rules are liberal enough that most randomly typed words or even single characters will be recognized as symbols (and thus atoms), but placing a parenthesis in a symbolic label or starting it with a digit would be serious violations, and the reader would stop cooperating with us.

It is worth noting that the concept of an atom as a class of syntactic expressions is not widely used among Clojure programmers. This is probably because in Clojure there exists a reference data type called Atom which helps perform concurrent data operations.

List S-expressions

The most commonly encountered class of S-expressions are list S-expressions. It is thanks to them that programs written in Lisp dialects consist of a large number of parentheses.

A list S-expression in Clojure should be a list of elements (other S-expressions) separated by spaces, commas, or both. The beginning and end of a list S-expression should be marked by an opening and closing round parenthesis.

List expressions with symbols placed in their first positions serve to invoke subroutines (functions, macros, or special constructs) identified by them. The remaining elements of the list then express arguments passed to the invocation. In the case of functions, argument values will be computed before being applied.

If no elements are provided, a list S-expression will produce an empty list.

List S-expression examples
 1;; calling the + function
 2
 3(+ 1 2 3)
 4;=> 6
 5
 6;; calling the defn macro
 7;; used to define a named function
 8
 9(defn greet [] "Hello!")
10;=> #'user/greet
11
12;; calling the defined greet function
13
14(greet)
15;=> "Hello!"
16
17;; special form def
18;; used to define a global variable
19
20(def x 2)
21;=> #'user/x
22
23;; doubling the value of the global variable
24;; identified by the symbol x
25
26(+ x x)
27;=> 4
28
29;; empty list
30
31()
32;=> ()
;; calling the + function (+ 1 2 3) ;=&gt; 6 ;; calling the defn macro ;; used to define a named function (defn greet [] &#34;Hello!&#34;) ;=&gt; #&#39;user/greet ;; calling the defined greet function (greet) ;=&gt; &#34;Hello!&#34; ;; special form def ;; used to define a global variable (def x 2) ;=&gt; #&#39;user/x ;; doubling the value of the global variable ;; identified by the symbol x (+ x x) ;=&gt; 4 ;; empty list () ;=&gt; ()

Older Lisp dialects supported a somewhat different form of list S-expressions. Inside a pair of parentheses one did not place a list of all elements, but only one cell of the list, divided into a left and right value separated by a dot, with the end of the list marked by the symbol nil, e.g.:

(+ . (1 . (2 . nil)))
(+ . (1 . (2 . nil)))

In Clojure, this kind of notation is not supported, although there are data structures that allow performing operations on individual cells and creating so-called sequences.

Vector S-expressions

Vector literals create so-called vector S-expressions, but unlike list expressions, elements placed in their first positions have no special significance. The result of using a vector S-expression in its basic form will be a data structure called a vector, and each of its elements will become a component of it after its value has been computed.

Using vector S-expressions we can:

  • create the aforementioned vectors and use them for application logic,

  • specify argument lists of defined functions and macros,

  • express bindings of symbols to values in appropriate special constructs (including let and binding),

  • perform so-called positional destructuring of compound structures with a sequential access interface.

Vector S-expression examples
;; literal vector

[1 2 3 4]
;=> [1 2 3 4]

;; literal vector

[(+ 1 1) 2 3 4]
;=> [1 2 3 4]

;; bindings vector in a let form
;; creates a lexical binding of symbol a to value 1

(let [a 1] a)
;=> 1

;; vector binding form in a let form (destructuring)
;; inside the bindings vector of the let form
;; binds each symbol from the S-expression [a b]
;; to an initializing value from the S-expression [1 2]
;; according to position

(let [[a b] [1 2]]
  (+ a b))
;=> 3

;; argument list in a named function definition

(defn add
  [a b]
  (+ a b))
(add 2 2)
;=> 4

;; empty vector

[]
;=> []
;; literal vector [1 2 3 4] ;=&gt; [1 2 3 4] ;; literal vector [(+ 1 1) 2 3 4] ;=&gt; [1 2 3 4] ;; bindings vector in a let form ;; creates a lexical binding of symbol a to value 1 (let [a 1] a) ;=&gt; 1 ;; vector binding form in a let form (destructuring) ;; inside the bindings vector of the let form ;; binds each symbol from the S-expression [a b] ;; to an initializing value from the S-expression [1 2] ;; according to position (let [[a b] [1 2]] (+ a b)) ;=&gt; 3 ;; argument list in a named function definition (defn add [a b] (+ a b)) (add 2 2) ;=&gt; 4 ;; empty vector [] ;=&gt; []

Map S-expressions

Using the map literal we can construct map S-expressions. In their basic form they allow expressing an associative data structure called a map, which consists of indexed key–value pairs. Values should be separated by spaces, commas, or both. The first element of each pair is called the key, and the second the value.

Using map S-expressions we can:

  • create the aforementioned maps and use them for application logic,

  • specify named argument lists of defined functions and macros,

  • construct so-called metadata maps that allow enriching certain constructs with metadata that can describe them or control their properties;

  • perform associative destructuring of compound structures with an associative access interface.

Map S-expression examples
;; literal map

{"a" 1 "b" 2}
;=> {"a" 1 "b" 2}

;; map expressing named arguments
;; of a defined function
;; keys a and b must be strings

(defn add [& {:strs [a b]}]
  (+ a b))

;; calling a function with named arguments

(add "b" 1, "a" 3)
;=> 4

;; metadata map of a defined function
;; with a documentation string (key :doc)

(defn add
  {:doc "This function adds two numbers."}
  [a b]
  (+ a b))
;=> #'user/add

;; calling the documentation string for add
(doc add)
;=>> -------------------------
;=>> user/add
;=>> ([a b])
;=>>   This function adds two numbers.
;=> nil

;; map binding form in a let form (destructuring)
;; inside the bindings vector of the let form
;; binds symbols a and b
;; to values from a map indexed by "a" and "b"

(let [{:strs [a b]} {"b" 1 "a" 3}]
  (+ a b))
;=> 4

;; empty map

{}
;=> {}
;; literal map {&#34;a&#34; 1 &#34;b&#34; 2} ;=&gt; {&#34;a&#34; 1 &#34;b&#34; 2} ;; map expressing named arguments ;; of a defined function ;; keys a and b must be strings (defn add [&amp; {:strs [a b]}] (+ a b)) ;; calling a function with named arguments (add &#34;b&#34; 1, &#34;a&#34; 3) ;=&gt; 4 ;; metadata map of a defined function ;; with a documentation string (key :doc) (defn add {:doc &#34;This function adds two numbers.&#34;} [a b] (+ a b)) ;=&gt; #&#39;user/add ;; calling the documentation string for add (doc add) ;=&gt;&gt; ------------------------- ;=&gt;&gt; user/add ;=&gt;&gt; ([a b]) ;=&gt;&gt; This function adds two numbers. ;=&gt; nil ;; map binding form in a let form (destructuring) ;; inside the bindings vector of the let form ;; binds symbols a and b ;; to values from a map indexed by &#34;a&#34; and &#34;b&#34; (let [{:strs [a b]} {&#34;b&#34; 1 &#34;a&#34; 3}] (+ a b)) ;=&gt; 4 ;; empty map {} ;=&gt; {}

Set S-expressions

The set literal enables writing set S-expressions. Thanks to them, one can express sets in a clear and straightforward way – that is, structures in which each element appears only once.

The set literal consists of curly braces preceded by a hash symbol, inside which unique-within-the-set values are placed. Elements of a set S-expression should be separated by a space, a comma, or both.

If an element of a set expression is not a constant value, it will be computed before the object representing the set is created.

Set S-expression examples
#{1 2 3 4}
;=> #{1 2 3 4}

#{1 (+ 1 1) 3 4}
;=> #{1 2 3 4}

;; empty set

#{}
;=> #{}
#{1 2 3 4} ;=&gt; #{1 2 3 4} #{1 (+ 1 1) 3 4} ;=&gt; #{1 2 3 4} ;; empty set #{} ;=&gt; #{}

Abstract syntax tree

We know that the result of syntactic analysis is a form of source code in the abstract syntax tree. Let us see what it will look like after reading our program:

(print "Hello, Lisp!")
(print &#34;Hello, Lisp!&#34;)

The opening parenthesis at the beginning will cause the reader’s mechanisms to treat the lexical construct as a list literal. Taking the content into account, it will form a grammatical construct: a list S-expression. During the parsing phase it will be transformed into a list object – that is, a structure used to store an ordered set of data. These data will be the representations of the two atomic S-expressions placed within the parentheses:

Converting text to AST

The syntax tree is not an element specific to Lisps. Compilers and interpreters of other programming languages also use it. However, in their case the AST is partially or entirely inaccessible to the programmer and built using internal data structures that are not supported by the language. Operating on the syntax tree from within the program is either impossible or limited to using special macro mini-languages. In Lisps, things are different, as we will learn later.

Returning to our program… The first element of the list S-expression is the atom print. It will be recognized as another known syntactic element: a symbol. The rules of the language state that it must be a word composed of alphanumeric characters that does not begin with a digit or with a special character indicating data of a different kind. In the in-memory list structure, an object of type Symbol will be placed at the first position. This data type resembles keywords or text labels known from other languages – it will be discussed more broadly in a later section.

The last component of the symbolic expression we entered is the string literal Hello, Lisp!, which can be detected by the quotation marks surrounding the text. It too is an atom and will be treated as a representation of a string, which will be placed at the end of the list residing in the syntax tree.

To summarize, in the AST the following will be created:

  • a data structure (list) consisting of:
    • a datum of type symbol,
    • a datum of type string.

Let us notice that the graphical representation of source code in the abstract syntax tree and the earlier-shown way of organizing S-expressions are similar to each other. Coincidence?

Homoiconicity

In Lisps, the in-memory AST objects are represented using the same data structures that we can use in programs. Moreover, their arrangement in the syntax tree is the same as the layout of S-expressions in the textual representation. In other words: the abstract syntax tree and the program text are isomorphic. If necessary, we can read the AST and obtain readable source code in the form of symbolic expressions.

The trait described here is called homoiconicity and opens the possibility of transforming S-expressions reflected in the AST using the system of syntactic macros, to which a separate chapter will be devoted.

Semantics

The next important stage of transforming a program into executable form is semantic analysis – that is, the process of recognizing the meaning-bearing constructs of the language. In Lisp dialects, this is one of the first tasks of the component called the evaluator. It consists of reading the objects of the abstract syntax tree representing expressions, checking whether they are semantically correct, and computing the value of each one by running the appropriate subroutines. In the case of Clojure, part of this process will be carried out during compilation, and part at program runtime.

Let us recall our example once more:

(print "Hello, Lisp!")
(print &#34;Hello, Lisp!&#34;)

After reading the source code into memory, data representing it were placed in the AST, and during program execution, each datum will be evaluated in order to ultimately obtain the value of the outermost S-expression.

In the given example, the first element of the list is the symbol print, which identifies a built-in Clojure function used to display text on the screen. By examining the current namespace, the compiler will find the appropriate mapping of the symbol to a function object and determine that it is dealing with a construct that is a function call. It will also identify the subroutine that needs to be invoked to perform the operation.

Functions can accept arguments, and the given symbolic expression also contains (apart from the operation name) an argument intended for its call. Because in Lisps we are dealing with pass-by-value, the process of evaluating the string Hello, Lisp! will be performed first. A string is a constant value, so no further computation will be required, and it is this value that will be passed to the call.

Running the print function subroutine will produce the intended side effect of displaying the following on screen:

Hello, Lisp!

Additionally, the function will return a value that, depending on how the program was launched, will be displayed (in the case of an interactive console) or remain unhandled (in other cases).

Forms

The abstract syntax tree in Lisps is data organized as nested lists. Each such datum is a Lisp form, called simply a form – that is, an in-memory representation of a syntactically valid S-expression intended for evaluation.

The evaluator, analyzing AST elements, attempts to compute their values, paying attention to data types and the contexts in which they appear. Several main decision paths of the evaluating subroutine are possible, corresponding to different kinds of forms encountered.

By examining only the data structures in the AST or the S-expressions in the source code text, we cannot precisely determine what specific form we will be dealing with. This can only be established during evaluation. For example, the symbol x placed at the first position of the list S-expression (x 1) may, after evaluation, turn into a function object – and then it will be a function-call form for x – but it may also turn out to be, say, a map, and then it will be a map lookup form where the value 1 is treated as a key.

Before we proceed to discussing the main kinds of forms, let us name those that will be recognized in our program:

Forms in the example program
(print "Hello, Lisp!")
(print &#34;Hello, Lisp!&#34;)
  • (...) – compound form,
  • print – symbol form,
  • "Hello, Lisp!" – normal form,
  • (print "Hello, Lisp!") – function-call form:
    • after detecting that print is bound to a function;
    • after detecting the function’s argument form;
    • after computing the argument to a normal form.

Constant and normal forms

A source code expression representing a constant value that does not require computation will be recognized by the evaluator as a so-called constant form and passed unchanged to the calling, parent construct.

The most direct kind of constant form is the so-called normal form – a source code datum that by its nature does not require evaluation, because it is already represented in the syntax tree by its own value. Examples include numeric literals and string literals.

Constant form examples
"Hello, Lisp!"    ; string
\a                ; character
()                ; empty list
123               ; integer
'lalala           ; quoted symbol
&#34;Hello, Lisp!&#34; ; string \a ; character () ; empty list 123 ; integer &#39;lalala ; quoted symbol

An interesting kind of constant form is the result of applying the quote form – a special syntactic form that allows disabling evaluation of the S-expressions passed to it and treating the source code data as constant structures. The quote construct and the quoting mechanism will be discussed in more detail further on.

Symbol forms

A symbol will be treated by the evaluator as a symbol form, and it will attempt to find the object bound to it, which – depending on the context – may be assigned to that symbol’s name in various places, e.g., in lexical bindings or namespaces (and global variables). The obtained value will be substituted in place of the symbol’s occurrence.

Symbol form examples
print
+
print +

See also:

Compound forms

When the evaluator encounters a non-empty collection, it will recognize it as a so-called compound form.

In the case of a map form, a vector form, or a set form, each element will be evaluated, and a collection will be returned in which every element has been computed to a constant form.

Examples of common collection forms
 [1 2 3]        ; vector form
#{1 2 3}        ; set form
 {"a" 1 "b" 2}  ; map form
'(1 2 3)        ; list form (literal list)
[1 2 3] ; vector form #{1 2 3} ; set form {&#34;a&#34; 1 &#34;b&#34; 2} ; map form &#39;(1 2 3) ; list form (literal list)

In the case of a list or a sequence of Cons objects, the value of the first element will be computed, and a decision will be made regarding the further course of action depending on the detected form.

Lookup forms

If the first element of a list form being evaluated is a vector, a set, or a map, then a vector lookup form, a set lookup form, or a map lookup form will be recognized, respectively. The value of the next element in the list will be passed as an argument to the subroutine responsible for finding elements by given indices. If there are more arguments, an error will be generated.

Collection lookup form examples
([1 2 3]         0)  ;=> 1
(#{1 2 3}        2)  ;=> 2
({"a" 1 "b" 2} "b")  ;=> 2
([1 2 3] 0) ;=&gt; 1 (#{1 2 3} 2) ;=&gt; 2 ({&#34;a&#34; 1 &#34;b&#34; 2} &#34;b&#34;) ;=&gt; 2

If the first element of a list form being evaluated is a keyword or a literal symbol, then a keyword lookup form or a symbol lookup form will be recognized, and the value of the next element in the list will be passed as an argument to a subroutine that uses the given keyword or symbol as an index to find the value. If there are more arguments, an error will be generated.

Keyword and symbol lookup form examples
(:a {:a 1 :b 2})       ;=> 1
(:a #{:a :b :c})       ;=> :a
('a {'a 1 'b 2 :c 3})  ;=> 1
(:a {:a 1 :b 2}) ;=&gt; 1 (:a #{:a :b :c}) ;=&gt; :a (&#39;a {&#39;a 1 &#39;b 2 :c 3}) ;=&gt; 1
Function-call forms

If the first element of a list form being evaluated is a function object, the expression will be treated as a function-call form, and the computed values of the remaining elements of the list will be passed as arguments to the subroutine associated with that object. This also applies to Java methods.

Function-call form examples
(print "Hello, Lisp!")  ;=> nil =>> Hello, Lisp!
(+ 2 2)                 ;=> 4
(.toLowerCase "A")      ;=> a
(print &#34;Hello, Lisp!&#34;) ;=&gt; nil =&gt;&gt; Hello, Lisp! (+ 2 2) ;=&gt; 4 (.toLowerCase &#34;A&#34;) ;=&gt; a
Special forms

If the first element of a list form being evaluated is one of the so-called syntax forms, the built-in handler subroutine for the associated special form will be launched, and the remaining elements of the list will become its arguments (after their prior evaluation or without evaluation).

Special form usage examples
(def   x 1)                           ; defining a global variable
(fn    [x] (inc x))                   ; creating a function
(def   f (fn [x] (inc x)))           ; defining a named function
(let   [a 1] a)                       ; creating lexical bindings
(quote (1 2 3))                       ; quoting
(. System (getProperty "user.home"))  ; accessing Java classes
(def x 1) ; defining a global variable (fn [x] (inc x)) ; creating a function (def f (fn [x] (inc x))) ; defining a named function (let [a 1] a) ; creating lexical bindings (quote (1 2 3)) ; quoting (. System (getProperty &#34;user.home&#34;)) ; accessing Java classes

Special forms are – as the name itself indicates – forms characterized by special evaluation rules. In essence, a special form is a kind of function built into the language, whose – similarly to macros – arguments are not all immediately evaluated.

Thanks to special forms we can, for example, define global variables or functions, and also create bindings of symbols to values in certain areas of a program.

Recursive evaluation of forms

When a datum obtained in the course of the evaluator’s work turns out not to be a constant value, but another form requiring evaluation, it will be recursively computed until a constant form is obtained. For example, a symbol form may emit a Var reference object, which in turn will point to the subroutine of the function being called.

Invalid forms

If it turns out that a given AST element is not a valid form – that is, one whose value cannot be computed (reduced to a constant form) – an error message will be generated and the program will terminate abnormally.

Examples of S-expressions that are not forms
1(1 2 3)       ; the number 1 cannot be cast to a function type
2(())          ; an empty list cannot be cast to a function type
3(+ /)         ; the function / cannot be cast to an integer type
4(no_such)     ; no construct identified by the symbol no_such
5no_such       ; no construct identified by the symbol no_such
(1 2 3) ; the number 1 cannot be cast to a function type (()) ; an empty list cannot be cast to a function type (+ /) ; the function / cannot be cast to an integer type (no_such) ; no construct identified by the symbol no_such no_such ; no construct identified by the symbol no_such

The names of some forms are conventional – that is, they are based on the assumption that certain S-expressions can be computed to data of specific types. For example, the notation:

(add 1 2 3)
(add 1 2 3)

we will conventionally call a function-call form, even though in reality we are not certain whether the symbol form add will be recognized in the namespace as a global variable, and whether the reference object found there will contain a reference to a function.

Binding forms

A binding form is a construct in which a value is assigned to a symbolic identifier in order to create a binding. At the syntactic level, these forms will contain unquoted symbols, but they will not be treated as symbol forms whose values need to be resolved; rather, they serve as expressions that assign in-memory objects to symbolic names.

We will recognize a binding form in expressions representing certain special forms, where we will encounter:

  • a pair composed of an unquoted symbol and another form (representing the value that, once computed, is to be assigned);

  • an unquoted symbol appearing on its own, e.g., in a function’s parameter vector, where the value will be dynamically assigned when arguments are passed during the function’s invocation.

Symbol binding forms are components of certain compound forms and [special forms][formy specjalne]. We will find them, for example:

Additionally, binding forms may also appear in so-called structural bindings – that is, abstract bindings in which compound data structures are destructured and symbols are assigned values at specified positions or identified by given keys. We will then encounter a vector binding form or a map binding form, in which instead of a single unquoted symbol on the left side, a vector or map S-expression appears.

Some other constructs that use binding forms can also be intuitively referred to as binding forms. These are expressions whose primary purpose is to create bindings. For example, the special form let will also be called a let binding form, even though in reality it uses potentially many basic binding forms grouped in a bindings vector.

Binding form examples
(def x 5)       ; symbol x in a binding form
(fn [x] nil)    ; symbol x in a binding form
(let [x 2])     ; symbol x in a binding form

(defn some-fn   ; symbol some-fn in a binding form (function name)
  [x]           ; symbol x in a binding form (parameter name)
  nil)
(def x 5) ; symbol x in a binding form (fn [x] nil) ; symbol x in a binding form (let [x 2]) ; symbol x in a binding form (defn some-fn ; symbol some-fn in a binding form (function name) [x] ; symbol x in a binding form (parameter name) nil)

It is worth noting that in the case of function arguments, the symbols identifying their parameters will be dynamically assigned to the values of arguments passed at the moments of function invocation.

See also:

Quoting

An important special construct in Clojure and other Lisps is the special form quote, which allows disabling evaluation of the S-expression given as its argument. A syntactic construct written this way will be read and syntactically analyzed, but the phase of semantic analysis will be skipped for it. Appropriate data structures will be placed in the AST, but they will be marked as constant forms. Instead of computing their values, the evaluator will simply return the data structures found in the syntax tree.

In Lisp dialects, using the quote form allows creating literal variants of structures that in their unquoted forms would be used to represent program source code and/or would be evaluated.

Examples of using the quote form
1(quote one)        ; literal symbol
2(quote (1 2 3))    ; literal list
3(quote [a b])      ; literal vector
4(quote {a 1 b 2})  ; literal map
5(quote #{1 2 3})   ; literal set
(quote one) ; literal symbol (quote (1 2 3)) ; literal list (quote [a b]) ; literal vector (quote {a 1 b 2}) ; literal map (quote #{1 2 3}) ; literal set

The above can also be written using syntactic sugar:

1'one
2'(1 2 3)
3'[a b]
4'{a 1 b 2}
5'#{a b}
&#39;one &#39;(1 2 3) &#39;[a b] &#39;{a 1 b 2} &#39;#{a b}

The same data structures could be produced without using quoting, by using the appropriate built-in functions of the language:

1(symbol "one")                            ; symbol
2(list 1 2 3)                              ; list
3(vector (symbol "a") (symbol "b"))        ; vector
4(hash-map (symbol "a") 1 (symbol "b" 2}) ; map
5(hash-set 1 2 3)                          ; set
(symbol &#34;one&#34;) ; symbol (list 1 2 3) ; list (vector (symbol &#34;a&#34;) (symbol &#34;b&#34;)) ; vector (hash-map (symbol &#34;a&#34;) 1 (symbol &#34;b&#34; 2}) ; map (hash-set 1 2 3) ; set

Quoting is recursive – that is, every S-expression nested within a quoted one will also be quoted.

Recursive quoting examples
1(quote (a b c))        ; list with literal symbols
2(quote (+ 2 (* 2 3)))  ; list with literal symbols and numbers
3(quote [one 2 3])      ; vector with a literal symbol and numbers
4(quote {a 1 b 2})      ; map with literal symbols and numbers
5(quote #{a b c})       ; set with literal symbols
(quote (a b c)) ; list with literal symbols (quote (+ 2 (* 2 3))) ; list with literal symbols and numbers (quote [one 2 3]) ; vector with a literal symbol and numbers (quote {a 1 b 2}) ; map with literal symbols and numbers (quote #{a b c}) ; set with literal symbols

It is worth noting that the terms “literal map”, “literal set”, and “literal vector” may in certain contexts be considered overly zealous labels. Each element of symbolically expressed maps, sets, or vectors will be eagerly evaluated by the evaluator, and their evaluation ends there. They remain the same data structures and do not carry the same syntactic weight as lists.

Augmenting the names of the above-mentioned structures with the qualifier “literal” will make sense when we want to emphasize that the contents of their elements will not be computed. An example would be a situation in which the elements of a vector S-expression are symbols pointing to certain values. When we quote such a construct, the values of elements will not be computed during vector evaluation. If we want to emphasize this fact, we can call the vector literal.

Literal vectors and vector forms
[1 2 3]         ; vector form (colloq. vector, rarely: literal vector)
[1 2 (inc 2)]   ; vector form (colloq. vector)
'[1 2 (inc 2)]  ; literal vector (colloq. vector)
'[1 2]          ; literal vector
[1 2 3] ; vector form (colloq. vector, rarely: literal vector) [1 2 (inc 2)] ; vector form (colloq. vector) &#39;[1 2 (inc 2)] ; literal vector (colloq. vector) &#39;[1 2] ; literal vector

The most sensible approach seems to be to describe maps, vectors, and sets as literal when they are expressed in a quoted form. In the case of unquoted forms (subject to evaluation), colloquial terms can be used.

Let us also try quoting applied to our template program:

'(print "Hello, Lisp!")
&#39;(print &#34;Hello, Lisp!&#34;)

The result of evaluating the above expression is a value which is a list containing a symbol and a string – that is, our original program, expressible as the text:

(print "Hello, Lisp!")

It is worth noting that when a constant form is quoted, its value after evaluation is still its own previous, constant value, as we can observe with the string Hello, Lisp!.

Identifiers

Identifiers are constructs that allow us to name identities (e.g., individual values or compound data structures placed in memory) so that we can later refer to them.

In Clojure there are identification constructs (symbols, discussed below) that carry special syntactic meaning and whose named objects are resolved automatically. There are also constructs (e.g., keywords) that serve to identify utility data, but their use is not mandatory.

Symbols

A symbol in Clojure is a data type whose instances serve to identify structures placed in memory. Thanks to symbols (and the appropriate handling of them by the reader mechanisms) we can give names to functions and their arguments, to the results of computations, and to other objects, and subsequently refer to them using human-readable identifiers.

In Clojure, symbol forms can be expressed in program text without any additional markers. A symbol’s name is at the same time its own identifier, although the symbol object placed in the AST will be subject to further evaluation by the evaluator (it will not be self-evaluating).

Syntactically speaking, every symbol has a name that must be a string beginning with a non-digit character and may contain alphanumeric characters as well as: *, +, !, -, _, and ?.

Examples of symbol usage
1func                ; func is a symbol form
2(func 1 2 3)        ; func is a symbol form in a function-call form
3(fn [x y] x)        ; symbols x and y are parameters of an anonymous function
4                    ; [x y] contains two symbol binding forms
5(.toLowerCase "A")  ; toLowerCase is a Java class method
6
7; ChunkedSeq is an inner class of PersistentVector
8(new clojure.lang.PersistentVector$ChunkedSeq [1 2 3 4 5] 0 3)
func ; func is a symbol form (func 1 2 3) ; func is a symbol form in a function-call form (fn [x y] x) ; symbols x and y are parameters of an anonymous function ; [x y] contains two symbol binding forms (.toLowerCase &#34;A&#34;) ; toLowerCase is a Java class method ; ChunkedSeq is an inner class of PersistentVector (new clojure.lang.PersistentVector$ChunkedSeq [1 2 3 4 5] 0 3)

In many Lisps a symbol is a reference type, meaning it independently identifies another object by storing a reference to its in-memory structure. In Clojure things are different – a symbol does not hold any reference, and the fact that we can use symbol forms to invoke functions or refer to constant values is owed to the appropriate treatment by the evaluator and the searching of additional structures (e.g., namespaces or lexical bindings areas).

The dot character (.) within a symbol’s textual name carries special significance, as it allows building identifiers that refer to class names of the host system, i.e., Java.

Namespaced symbols

A symbol may optionally contain an additional name specifying a so-called namespace – a special collection that serves to group identifiers in order to eliminate conflicts. More details about using this visibility-control mechanism can be found later in this episode.

The namespace name is a string subject to the same syntactic rules as the symbol name, and is expressed by placing it before the symbol name and separating it with a slash (/), for example:

1my/func        ; symbol func with namespace my
2(my/func 1 2)  ; symbol func with namespace my in a function-call form
my/func ; symbol func with namespace my (my/func 1 2) ; symbol func with namespace my in a function-call form

In the symbol object itself we will find no reference to the namespace other than a textual one. Here too, no reference is stored. We can specify a nonexistent namespace and it will not be an error – not until the evaluator begins evaluating the notation.

Literal symbols

We encounter symbols primarily in the abstract syntax tree, where they will be symbol forms representing identifiers of other objects. Additionally, we can use literal symbols (constant forms) in application logic. In that case they will serve, for example, as simple enumerated types, representing constant values belonging to fixed sets:

1(list (symbol "little") (symbol "much") (symbol "most"))
2(list 'little 'much 'most)
3'(little much most)
(list (symbol &#34;little&#34;) (symbol &#34;much&#34;) (symbol &#34;most&#34;)) (list &#39;little &#39;much &#39;most) &#39;(little much most)

The three notations above are equivalent. The first constructs a literal list whose elements are symbols created from strings using the built-in symbol function; the second also produces a list, but uses quoting to express literal symbols; the third makes use of recursive quoting of the entire list S-expression.

Constant forms of symbols may also include a namespace designation:

1(symbol "name" "space")
2'name/space
(symbol &#34;name&#34; &#34;space&#34;) &#39;name/space

In Lisps, literal symbols are sometimes used as indexing keys in associative structures (e.g., maps):

{ 'little 21, 'much 108, 'most 11 }
{ &#39;little 21, &#39;much 108, &#39;most 11 }

In Clojure, using symbols for this purpose is not recommended because they are not interned – two symbols with the same name will be represented by two different in-memory objects. In other words: in Clojure, symbol instances with the same names are not singletons.

If we used symbols to index large, frequently accessed structures, a great many temporary objects would be created, and the garbage collector would have to deal with them at the cost of valuable time. Another drawback is less efficient comparison (and hence searching), since symbol names would be compared rather than internal object identifiers.

See also:

Keywords

A keyword (colloquially called a key) is a data type that – similarly to symbols – serves to identify other objects, but in Clojure it carries no special syntactic meaning and keywords do not automatically identify other program constructs. Keywords are represented by objects of type clojure.lang.Keyword.

Unlike symbols, keywords are interned. Two keywords with the same name will be represented in memory by the same object (their singleton).

Keywords work well as simple enumerated types or as indices in associative data structures (e.g., maps). As for the built-in mechanisms of the Clojure language, we will encounter keywords in certain macros and in constructs that express named arguments bindings of functions.

From the syntactic point of view, every keyword has a name that must be a string beginning with a colon (:) and may contain alphanumeric characters as well as: *, +, !, -, _, and ?. In practice a slightly richer character set may be used (e.g., the dot is increasingly common for denoting internal hierarchies in some libraries), but this may change in the future, so it is worth using key names consistent with the language documentation.

Keywords are also functions. When we place a keyword in the first position of a list S-expression, a subprogram will be invoked that tries to look up an index in an associative or set collection given as an argument.

Like symbols, keywords may optionally include a namespace designation.

Examples of creating keywords
:a-key                              ; keyword
::a-key                             ; keyword with current namespace
:space/a-key                        ; keyword with namespace
::user/a-key                        ; keyword with namespace (existing)
(keyword "a-key")                   ; keyword from a string
(keyword "space" "a-key")           ; same but with namespace specified
{:a 1 :b 2}                         ; map with keyword indices
(defn x [& {:keys [color]}] color)  ; function with a named keyword argument
(x :color 123)                      ; calling function with keyword argument
(:a {:a 1 :b 2})                    ; keyword as a function searching a map
(:a #{:a :b :c :d})                 ; keyword as a function searching a set
:a-key ; keyword ::a-key ; keyword with current namespace :space/a-key ; keyword with namespace ::user/a-key ; keyword with namespace (existing) (keyword &#34;a-key&#34;) ; keyword from a string (keyword &#34;space&#34; &#34;a-key&#34;) ; same but with namespace specified {:a 1 :b 2} ; map with keyword indices (defn x [&amp; {:keys [color]}] color) ; function with a named keyword argument (x :color 123) ; calling function with keyword argument (:a {:a 1 :b 2}) ; keyword as a function searching a map (:a #{:a :b :c :d}) ; keyword as a function searching a set

See also:

Namespaces

In Clojure a construct called a namespace is used. In information technology this term denotes a mechanism for controlling the visibility of identifiers that allows their hierarchical grouping in order to avoid conflicts. Common examples of namespaces include file system directory structures, DNS records, and public IP addresses.

Namespaces in computer programming help separate sets of identifiers used in different components (e.g., programming libraries) or contexts, so that the exposed names (e.g., of modules, classes, variables, or functions) are unique. We can then use several libraries in a program that define functions with the same name. When referring to such a function, a so-called fully qualified name will be required – an identifier enriched with a namespace designation.

In Clojure, namespaces are implemented as globally visible maps, i.e., dictionaries storing key-value mappings. In their case the keys are symbols, and the values are global variables or Java classes. The data type used to represent namespace objects is clojure.lang.Namespace.

Since symbols placed in a program may optionally include a namespace designation, it is possible to use them to refer to identifiers from different namespaces. If a symbol does not contain a fully qualified name, then during its evaluation it is assumed that it identifies a construct defined in the current namespace, pointed to by the dynamic global variable clojure.core/*ns*:

;; symbol select from namespace clojure.set
;; identifies a function

(clojure.set/select odd? #{1 2 3 4})
;=> #{1 3}

;; symbol + from namespace clojure.core
;; imported into the current namespace (user)
;; identifies a function

(+ 2 2)
;=> 4
;; symbol select from namespace clojure.set ;; identifies a function (clojure.set/select odd? #{1 2 3 4}) ;=&gt; #{1 3} ;; symbol + from namespace clojure.core ;; imported into the current namespace (user) ;; identifies a function (+ 2 2) ;=&gt; 4

The first symbol in the above example has the name select and its namespace is specified by the string clojure.set. Since this is a function-call form, it will be treated as the identifier of a function whose subprogram needs to be executed. To find the function object, the namespace clojure.set will be searched and within it the key identified by the symbol select and the global variable assigned to it. Inside the reference object, a reference to the function subprogram will be found and invoked with the arguments given as the subsequent elements of the list S-expression (odd? and #{1 2 3 4}).

The second symbol in the example has no namespace specified. The evaluator will therefore read the contents of the dynamic global variable *ns*, which by default in the REPL console points to the namespace user. There it will find the binding of the symbol + to a Var object, which refers to the built-in addition function. The latter does not originally reside in the namespace user but was imported into it along with other mappings from the namespace clojure.core when the console was started.

See also:

Handling global state

When talking about Clojure, it is emphasized that there are no conventional variables and that data structures are immutable. Why, then, does the term “global variable” keep appearing?

Every programming language must handle changing global state – values that undergo change and are visible throughout the entire program under fixed names. Examples include: a character’s condition in a game, the arrangement of windows in a user interface, or the currently processed contents of a file being read.

To represent such data one might use conventional variables – fixed memory locations with given names whose contents are modified by appropriate subprograms (e.g., updating a score, reacting to user actions on interface elements, or reading data from a database). However, in such a model we cannot use multiple threads without applying additional isolation mechanisms. The memory slot of a variable representing an important parameter may be reorganized by one thread while another is still performing a critical operation with it. This means more work for the programmer, who instead of focusing on the application’s business logic must remember to safeguard the program against itself by introducing semaphores, locks, and similar constructs.

The opposite approach is to give up shared state entirely – a purely functional model in which no already-used memory area can ever be modified and the entire program consists of operations whose only input is their arguments and whose only output is their returned values. Handling graphical interfaces or tracking the activity of a logged-in user would in such a setup be cumbersome to express and inefficient. Imagine every user action triggering a cascade of function calls that each time recompute from scratch the values of all parameters constituting the program’s state at a given moment.

In Clojure we encounter a sensible compromise. Conventional variables do not exist, but thanks to appropriate reference types we are able to track and manage changing global state. This means that data mutations are permissible, but only as an exception and only using special constructs.

Reference objects in Clojure are equipped with concurrency-safe mechanisms for updating and reading the current values they point to. Instead of directly modifying the memory area containing data, an update to the reference takes place in such a way that the object begins pointing to a new value (placed at a different location). Values therefore remain constant and need not reside in variables, because the only element that undergoes mutation (overwriting of memory space) is the reference to them residing in reference objects.

The Var type

The most widely used reference type in Clojure is Var. It allows creating references to data placed in memory and changing those references within a given thread of execution. Additionally, a Var object may optionally contain a so-called root binding, which is shared across all threads and used when no thread-specific binding has been set.

By convention, Var objects in Clojure are interned in namespaces – there is no commonly used special form or function that would allow creating them without binding them to some symbolic identifier. There is, however, the with-local-vars macro, which creates Var objects in lexical scope – they must then be identified by given symbols visible within the expression passed as an argument. We can read the values they point to using the deref function or the dereference literal (the at sign placed before the symbol):

(with-local-vars [a 5] @a)
;=> 5
(with-local-vars [a 5] @a) ;=&gt; 5

See also:

Global variables

Objects of type Var (more precisely clojure.lang.Var) are used together with symbols in Clojure to create global variables. Global variables serve to identify infrequently changing identities, e.g., configuration values or functions defined in the program.

It works as follows: a particular Var is assigned to a symbolic name by placing an entry in one of the namespaces. This creates a global binding of a symbol to the current value of the Var object.

Let us look at a diagram illustrating how a function object is obtained from its name in our example program:

Resolving a symbolic function name

The evaluator performed 3 essential steps to obtain the invoked function’s subprogram:

  1. It searched the namespace clojure.core to find the value assigned to the symbol print.

  2. It obtained a Var object and passed it to the appropriate subprogram to read the current value of the reference.

  3. It found the function object pointed to by the reference variable, whose subprogram can be invoked.

Creating global variables is possible using the def special form:

 1(def x 8)                ; x points to the value 8
 2(def y [1 2 3])          ; y points to the vector [1 2 3]
 3(def write print)        ; write points to the current value of print variable
 4(declare later)          ; creates a global variable without binding to a value
 5
 6(write x)                ;=> 8
 7(write y)                ;=> [1 2 3]
 8(write "Hello, Lisp!")   ;=> Hello, Lisp!
 9
10later
11;=> #object[clojure.lang.Var$Unbound 0x53245a5b "Unbound: #'user/later"]
(def x 8) ; x points to the value 8 (def y [1 2 3]) ; y points to the vector [1 2 3] (def write print) ; write points to the current value of print variable (declare later) ; creates a global variable without binding to a value (write x) ;=&gt; 8 (write y) ;=&gt; [1 2 3] (write &#34;Hello, Lisp!&#34;) ;=&gt; Hello, Lisp! later ;=&gt; #object[clojure.lang.Var$Unbound 0x53245a5b &#34;Unbound: #&#39;user/later&#34;]

See also:

Bindings

A binding is, in general terms, an association of an identifier with an identified object. Using this term helps us avoid misunderstandings related to the subtle but – from the programmer’s perspective – important differences in how objects are referred to by names.

In imperatively rooted languages we often use the concept of a “variable” and expect, by force of habit, that it will have some name. The term lets us point to a memory slot in which we will find a value. In Clojure this approach could be misleading, because we can create nameless variables (reference objects) as well as name values that are not variables at all.

In Clojure we encounter several kinds of bindings:

The last item requires explanation, because – as is easy to notice – it contains no mention of any symbol, and we have repeatedly stated that it is symbols that serve to identify other objects.

At this point it is worth recalling that in Clojure symbols do not independently store references to values – this function is performed by reference objects (e.g., of type Var in the case of global variables). This follows from the adopted model of managing mutable state.

Examples of bindings
;; global variable
;; initial value pointed to by x is 1

(def x 1)
;=> #'user/x

;; parameter vector of a function definition
;; the first two arguments become parameters a and b

(fn [a b] (+ a b))
;=> #<Fn@5af25442 user/eval14406[fn]>

;; lexical binding
;; symbol a identifies the value 1

(let [a 1]
  a)
;=> 1

;; positional decomposition in let
;; symbol a bound to value 1 by position
;; symbol b bound to value 2 by position

(let [[a b] '(1 2 8)]
  (+ a b))
;=> 3

;; associative decomposition in let
;; symbol a bound to value 1 by key :a
;; symbol b bound to value 2 by key :b

(let [{a :a b :b} {:a 1 :b 2 :c 8}]
  (+ a b))
;=> 3

;; associative decomposition in let (using :keys)
;; symbol a bound to value 1 by key :a
;; symbol b bound to value 2 by key :b

(let [{:keys [:a :b]} {:a 1 :b 2 :c 8}]
  (+ a b))
;=> 3

;; binding a Var reference object to a value

(with-local-vars [x 5] x)
;=> #'Var: --unnamed-->

;; binding an Atom reference object to a value

(atom 5)
;=> #<Atom@a1cb453 5>
;; global variable ;; initial value pointed to by x is 1 (def x 1) ;=&gt; #&#39;user/x ;; parameter vector of a function definition ;; the first two arguments become parameters a and b (fn [a b] (+ a b)) ;=&gt; #&lt;Fn@5af25442 user/eval14406[fn]&gt; ;; lexical binding ;; symbol a identifies the value 1 (let [a 1] a) ;=&gt; 1 ;; positional decomposition in let ;; symbol a bound to value 1 by position ;; symbol b bound to value 2 by position (let [[a b] &#39;(1 2 8)] (+ a b)) ;=&gt; 3 ;; associative decomposition in let ;; symbol a bound to value 1 by key :a ;; symbol b bound to value 2 by key :b (let [{a :a b :b} {:a 1 :b 2 :c 8}] (+ a b)) ;=&gt; 3 ;; associative decomposition in let (using :keys) ;; symbol a bound to value 1 by key :a ;; symbol b bound to value 2 by key :b (let [{:keys [:a :b]} {:a 1 :b 2 :c 8}] (+ a b)) ;=&gt; 3 ;; binding a Var reference object to a value (with-local-vars [x 5] x) ;=&gt; #&#39;Var: --unnamed--&gt; ;; binding an Atom reference object to a value (atom 5) ;=&gt; #&lt;Atom@a1cb453 5&gt;

See also:

Collections

A collection is a data structure that allows storing a certain number of elements.

Lists

In Lisp the most commonly used data structure is the list, which – as we have had the chance to observe – serves both for the syntactic organization of program code (list S-expressions) and for storing ordered collections of elements for the purposes of application logic. It is worth thinking of a list as a way of arranging data, not merely a symbolic notation with parentheses.

There are several kinds of lists. The most commonly used in Lisp dialects are so-called linked lists, and more specifically singly linked lists. They are characterized by the ability to flexibly link elements together and to quickly add new ones to their beginnings. This is why they serve their role well as in-memory representations of Lisp program structures.

Lists in early Lisps

Historically speaking, each node of a list (in Lisp called a cons cell) has two slots: the first points to the value of the current element, and the second to the next element of the list (the next cons cell). In reality these are simply two pointers that can refer to arbitrary values.

Access to the first slot of each element is called car (from Contents of the Address part of Register number), and to the second cdr (from Contents of the Decrement part of Register number). The etymology of these peculiar names goes back to the time when Lisp was implemented on the IBM 704 computer (the 1950s). That machine had a special instruction that divided a 36-bit machine word into 4 parts – car and cdr are the abbreviated labels of the first two, and they found their way into Lisp because the author used them to divide the contents of the internal structure representing a list cell. In jargon, the first slot of a cons cell is therefore referred to as CAR, and the second as CDR.

There is nothing preventing a list from containing another list within it:

A Lisp list

In code we can express the above structure as:

(one two (1 2))
(one two (1 2))

that is:

1(one   ; first element of the list
2 two   ; second element of the list
3 (     ; third element of the list -- a nested list
4  1    ; first element of the nested list
5  2))  ; second element of the nested list
(one ; first element of the list two ; second element of the list ( ; third element of the list -- a nested list 1 ; first element of the nested list 2)) ; second element of the nested list

And here is the notation using so-called full notation, which is not used in Clojure. Inside each pair of parentheses we find two slots separated by a dot:

(one . (two . ((1 . (2 . nil)) . nil)))
(one . (two . ((1 . (2 . nil)) . nil)))

The dots separate the CAR and CDR registers of the cells, and the nil elements denote the ends of lists. If the second slot (CDR) is to point to the next element of the list, in this notation that next element is placed after the dot and enclosed in parentheses.

Let us remember, however, that the notation (one two (1 2)) is a valid S-expression but not a form – unless we define a function named one and associate the symbol two with some value. We can also use quoting to strip the construct of its special meaning so that it expresses a constant form:

'(one two (1 2))
&#39;(one two (1 2))

In the earliest editions of Lisp, lists were constructed using the cons function (from construct). For example:

(cons 'one (cons 'two (cons (cons 1 (cons 2 nil)) nil)))
(cons &#39;one (cons &#39;two (cons (cons 1 (cons 2 nil)) nil)))

Which can also be presented as:

1(cons 'one
2      (cons 'two
3            (cons (cons 1
4                        (cons 2 nil))
5                  nil)))
(cons &#39;one (cons &#39;two (cons (cons 1 (cons 2 nil)) nil)))

Such list construction may exercise the mind, but it does not serve productive programming. At present some Lisp dialects have dropped support for full notation, although the cons function is still used – mainly for performing operations on existing lists rather than constructing them from scratch.

Lists in Clojure

In Clojure, lists are represented by the host system’s object data type called clojure.lang.PersistentList. Objects of this type can be found in the syntax tree, where they reflect list S-expressions, as well as in application data when the list function is used or a literal list is produced by quoting its symbolically expressed form.

Examples of creating lists in Clojure
1(list 1 2 3)     ; using the list function (arguments will be evaluated)
2(quote (1 2 3))  ; literal list (arguments will not be evaluated)
3'(1 2 3)         ; literal list (arguments will not be evaluated)
(list 1 2 3) ; using the list function (arguments will be evaluated) (quote (1 2 3)) ; literal list (arguments will not be evaluated) &#39;(1 2 3) ; literal list (arguments will not be evaluated)

Internally, PersistentList objects are doubly linked lists, although this property is hidden from the programmer. Moreover, in Clojure a list (like most data structures) is immutable – to introduce a change in its structure one never modifies the object representing it but instead creates a new one that differs from the previous one.

See also:

“Lists”, chapter IX.

Sequences and Cons objects

Apart from encapsulated lists, Clojure also provides objects of type Cons (more precisely clojure.lang.Cons). They are the closest counterpart of cons cells known from other Lisp dialects. More precisely, using cons we can construct so-called sequences – abstract collections characterized by a uniform access interface consisting of three basic operations:

  • reading the value stored in the cell,
  • reading the next cell linked to the current one,
  • attaching a new cell to an existing one.

We can place references to values of arbitrary types in Cons objects and link them into sequences using the cons function. It accepts two arguments: a value and an object also equipped with a sequential interface (these include, among others, lists, vectors, and maps). The returned value will be a Cons cell whose next cell is the one provided as an argument.

If the provided existing object is nil, a single-element list will be returned.

Example of constructing a sequence in Clojure
1(def the-first  (cons 3 nil))          ; creates the list (3)
2(def the-second (cons 2 the-first))    ; creates cons(2)-->(3)
3(def the-last   (cons 1 the-second))   ; creates cons(1)-->cons(2)-->(3)
4
5(first the-last)         ; => 1     ; value in the last added cell
6(first (rest the-last))  ; => 2     ; value in the next cell
(def the-first (cons 3 nil)) ; creates the list (3) (def the-second (cons 2 the-first)) ; creates cons(2)--&gt;(3) (def the-last (cons 1 the-second)) ; creates cons(1)--&gt;cons(2)--&gt;(3) (first the-last) ; =&gt; 1 ; value in the last added cell (first (rest the-last)) ; =&gt; 2 ; value in the next cell

Using Cons we can join not only individual values but also collections (e.g., ordinary lists). We then have a compound structure that can be expressed, for example, like this:

1(cons (list 1 2 3)
2      (cons (list 4 5 6)
3            (cons 10
4                  ())))
5
6; => ((1 2 3) (4 5 6) 10)
(cons (list 1 2 3) (cons (list 4 5 6) (cons 10 ()))) ; =&gt; ((1 2 3) (4 5 6) 10)

Such structures are sometimes used to quickly join already grouped values into larger sets that are flattened only when being read in order to obtain individual values.

Example of flattening a nested sequence
1(flatten
2    (cons '(1 2 3) (cons '(4 5 6) (cons 10 nil))))
3
4; => (1 2 3 4 5 6 10)
(flatten (cons &#39;(1 2 3) (cons &#39;(4 5 6) (cons 10 nil)))) ; =&gt; (1 2 3 4 5 6 10)

See also:

“Cons cells”, chapter X.

Vectors

A vector is an ordered collection of values indexed using integers starting from 0 (the first element). In program code we use the vector literal to express vectors; we can also use several built-in functions.

Vectors in Clojure are represented by objects of type clojure.lang.PersistentVector. They are characterized by very fast data lookup and adding new elements to their ends. Vectors are also functions – they accept one argument, an index number, whose element value will be returned.

Examples of creating vectors
;; creating a vector (function)

(vector 1 2 3 4)
;=> [1 2 3 4]

;; creating a vector from an S-expression (literal)

[1 2 3 4]
;=> [1 2 3 4]

;; creating a vector from another collection

(vec (list 1 2 3 4))
;=> [1 2 3 4]

;; searching a vector (element at index 0)

([1 2 3 4] 0)
;=> 1
;; creating a vector (function) (vector 1 2 3 4) ;=&gt; [1 2 3 4] ;; creating a vector from an S-expression (literal) [1 2 3 4] ;=&gt; [1 2 3 4] ;; creating a vector from another collection (vec (list 1 2 3 4)) ;=&gt; [1 2 3 4] ;; searching a vector (element at index 0) ([1 2 3 4] 0) ;=&gt; 1

See also:

Maps

A map is an associative collection that stores mappings of keys to values. Keys and values may be data of any type, but the same key must be unique within a given map. Searching maps for a value identified by a key is very fast, as is adding new elements.

In Clojure, maps are represented by the following data types:

  • clojure.lang.PersistentHashMap (a map based on hash tables),
  • clojure.lang.PersistentArrayMap (a map based on plain arrays),
  • clojure.lang.PersistentTreeMap (a sorted map).

In program code we express maps using the map literal in the form of curly braces containing pairs of expressions. We can also use the hash-map function to express pairs as successive invocation arguments.

Examples of creating maps
;; creating a map (function)

(hash-map :a 1 :b 2)
;=> {:a 1 :b 2}

;; creating a sorted map (function)

(sorted-map :a 1 :b 2)
;=> {:a 1 :b 2}

;; creating a map from an S-expression (literal)

{:a 1 :b 2}
;=> {:a 1 :b 2}

;; creating a map from an S-expression (literal)

{:a 1 :b (+ 1 1)}
;=> {:a 1 :b 2}

;; creating a map from two sequential collections

(zipmap [:a :b] [1 :b 2])
;=> {:a 1 :b 2}

;; creating a map from a sequence of pairs

(into {} [[:a 1] [:b 2]])
;=> {:a 1 :b 2}

;; creating a map from a sequence

(apply hash-map [:a 1 :b 2])
;=> {:a 1 :b 2}
;; creating a map (function) (hash-map :a 1 :b 2) ;=&gt; {:a 1 :b 2} ;; creating a sorted map (function) (sorted-map :a 1 :b 2) ;=&gt; {:a 1 :b 2} ;; creating a map from an S-expression (literal) {:a 1 :b 2} ;=&gt; {:a 1 :b 2} ;; creating a map from an S-expression (literal) {:a 1 :b (+ 1 1)} ;=&gt; {:a 1 :b 2} ;; creating a map from two sequential collections (zipmap [:a :b] [1 :b 2]) ;=&gt; {:a 1 :b 2} ;; creating a map from a sequence of pairs (into {} [[:a 1] [:b 2]]) ;=&gt; {:a 1 :b 2} ;; creating a map from a sequence (apply hash-map [:a 1 :b 2]) ;=&gt; {:a 1 :b 2}

See also:

Sets

A set is a data structure that allows storing elements of arbitrary values with the restriction that a given value may appear in the set only once. Sets are characterized by very fast addition and lookup of elements. In Clojure there are built-in functions intended for performing operations on sets, e.g., join, projection, union, etc.

Sets are represented by objects of the following data types:

  • clojure.lang.PersistentSet (a set based on hash tables),
  • clojure.lang.PersistentTreeSet (a sorted set).
Examples of creating sets
;; creating a set (function)

(hash-set :a :b :c)
;=> #{:a :b :c}

;; creating a sorted set (function)

(sorted-set :a :b :c)
;=> #{:a :b :c}

;; creating a set from an S-expression (literal)

#{:a :b :c}
;=> #{:a :b :c}

;; creating a set from an S-expression (literal)

#{:a :b (keyword "c")}
;=> #{:a :b :c}

;; creating a set from a sequence

(set [:a :b :c])
;=> #{:a :b :c}
;; creating a set (function) (hash-set :a :b :c) ;=&gt; #{:a :b :c} ;; creating a sorted set (function) (sorted-set :a :b :c) ;=&gt; #{:a :b :c} ;; creating a set from an S-expression (literal) #{:a :b :c} ;=&gt; #{:a :b :c} ;; creating a set from an S-expression (literal) #{:a :b (keyword &#34;c&#34;)} ;=&gt; #{:a :b :c} ;; creating a set from a sequence (set [:a :b :c]) ;=&gt; #{:a :b :c}

See also:

Functions

A function is a construct containing code that, during invocation, computes a value for a given set of arguments. In purely functional languages this is the only result of using a function, while in others (including Clojure) functions may produce so-called side effects – that is, besides returning a value, they may affect their surroundings (e.g., modifying the value of a global variable or using the I/O subsystem).

Moreover, the result of a function’s operations in Clojure may depend on its surroundings. An example would be referencing global reference objects expressing some mutable state within its computations, or using values originating from the lexical environment of the function’s definition. In the latter case we call such a function a closure.

A function definition in Clojure consists of a list S-expression containing:

  • a vector S-expression defining the argument list,
  • one or more S-expressions constituting the function body.

In Clojure we can also use multi-arity functions, which can handle more than one set of arguments. In their case the definition syntax will be somewhat different.

The return value of a function is the computed value of the last S-expression from its body (or from the body of a given arity in the case of a multi-arity function).

In Clojure, to create anonymous functions we use the fn special form, and to create named functions we use the defn macro. Anonymous functions are function objects that have not been given names. Named functions in Clojure have names expressed through global variables.

Defining a function that displays text with an exclamation mark
1(defn shout           ; named function definition
2  [text]              ; vector with arguments
3  (print text "!"))   ; calling print with parameter text
4
5(shout "Hello, Lisp")    ; function call with argument
6;=>> Hello, Lisp!        ; result of call on screen
7;=> nil                  ; returned value
(defn shout ; named function definition [text] ; vector with arguments (print text &#34;!&#34;)) ; calling print with parameter text (shout &#34;Hello, Lisp&#34;) ; function call with argument ;=&gt;&gt; Hello, Lisp! ; result of call on screen ;=&gt; nil ; returned value

We can also create an anonymous version of the above function, but for the purposes of the example we must place the entire construct in an S-expression representing a function-call form so that we can observe the result of its execution:

Defining an anonymous function
1(                       ; function-call form
2  (fn                   ; anonymous function definition
3    [text]              ; vector with arguments
4    (print text "!"))   ; calling print with parameter text
5  "Hello, Lisp!")       ; function call argument
6
7;=>> Hello, Lisp!       ; result of call on screen
8;=> nil                 ; returned value
( ; function-call form (fn ; anonymous function definition [text] ; vector with arguments (print text &#34;!&#34;)) ; calling print with parameter text &#34;Hello, Lisp!&#34;) ; function call argument ;=&gt;&gt; Hello, Lisp! ; result of call on screen ;=&gt; nil ; returned value

The above can also be expressed using the anonymous function literal:

Using the anonymous function literal
1(                    ; function-call form
2  #(print %1 "!")    ; definition and calling print with first parameter
3  "Hello, Lisp!")    ; function call argument
4
5;=>> Hello, Lisp!    ; result of call on screen
6;=> nil              ; returned value
( ; function-call form #(print %1 &#34;!&#34;) ; definition and calling print with first parameter &#34;Hello, Lisp!&#34;) ; function call argument ;=&gt;&gt; Hello, Lisp! ; result of call on screen ;=&gt; nil ; returned value

Nothing prevents us from defining an anonymous function and then assigning it a name ourselves:

Defining a named function manually
1(def shout                ; global variable definition
2  (fn                     ; anonymous function definition
3    [text]                ; vector with arguments
4    (print text "!")))    ; calling print with parameter text
5
6(shout "Hello, Lisp")     ; function call with argument
7;=>> Hello, Lisp!         ; result of call on screen
8;=> nil                   ; returned value
(def shout ; global variable definition (fn ; anonymous function definition [text] ; vector with arguments (print text &#34;!&#34;))) ; calling print with parameter text (shout &#34;Hello, Lisp&#34;) ; function call with argument ;=&gt;&gt; Hello, Lisp! ; result of call on screen ;=&gt; nil ; returned value

In the above example we define a global variable shout with an initial value assigned to the result of invoking the fn special form, which will be a function object – a representation of an existing function.

In Lisp, functions are first-class citizens – values that can be treated like any other: passed as arguments or returned as results of function execution.

See also:

Boolean values

To express boolean values in Clojure we can use the singletons of type java.lang.Boolean: true denoting logical truth and false denoting logical falsehood. These values are returned by built-in predicates (functions that return logical truth or falsehood based on their arguments) and conditional constructs.

Indeterminate value

The literal for the indeterminate value nil, whose name comes from the English expression not in list, has special meaning in Clojure and serves to express an object representing an indeterminate value or the absence of a value. Internally, nil is represented by the Java value null (a null pointer).

The nil construct is used in Clojure to communicate that a given form does not return any value, or that a value could not be obtained. Many built-in functions and conditional constructs use it. In the case of functions, nil denotes the absence of a result (returned value) or a default initial value (accepted argument), while in the case of conditionals it corresponds to the logical value of falsehood (false). Furthermore, in sequence handling the value nil is used to mark their ends.

Comments

Comments in computer programs serve to explain fragments of source code in natural language. They may serve a documentary purpose or contain notes regarding progress on certain tasks. Sometimes they are also used to present the results of computing example forms.

In Clojure a comment begins with the semicolon character (;), after which we place arbitrary text. The comment will consist of any sequence of characters up to the end of the line in which it was started:

 1(print "Hello, Lisp!")  ; this is a comment
 2
 3;; This is a longer comment
 4;; written according to the convention
 5;; of preceding each line
 6;; with two semicolons.
 7
 8;; The comments below
 9;; are used to present computation results.
10
11(+ 2 2)  ; => 4
12(- 2 2)  ; => 0
13
14(+ 2 2)
15; => 4
16
17'(1 2 3)
18; => (1 2 3)
(print &#34;Hello, Lisp!&#34;) ; this is a comment ;; This is a longer comment ;; written according to the convention ;; of preceding each line ;; with two semicolons. ;; The comments below ;; are used to present computation results. (+ 2 2) ; =&gt; 4 (- 2 2) ; =&gt; 0 (+ 2 2) ; =&gt; 4 &#39;(1 2 3) ; =&gt; (1 2 3)

Comment form

Apart from syntactic comments expressed with the semicolon, Clojure provides a special macro that allows disabling the computation of a given S-expression:

(comment
    (print "Hello, Lisp!")
    123)
(comment (print &#34;Hello, Lisp!&#34;) 123)

It is worth noting that comment is a construct to which syntactically valid symbolic expressions must be passed as arguments.

The value returned by the macro is nil.

Ignoring the next form

Another way of excluding part of a program from the value-computation process is to use the reader macro #_. It causes the next form placed after it to not be evaluated:

#_ (print "Hello, Lisp!")        ; disables the function-call form
(#_#_ print "Hello, Lisp!")      ; disables the first two nested forms
(print #_ "Hello, Lisp!")        ; disables the constant form "Hello, Lisp!"
[#_#_ 1 2 3]                     ; disables two nested constant forms 1 and 2
#_ (print &#34;Hello, Lisp!&#34;) ; disables the function-call form (#_#_ print &#34;Hello, Lisp!&#34;) ; disables the first two nested forms (print #_ &#34;Hello, Lisp!&#34;) ; disables the constant form &#34;Hello, Lisp!&#34; [#_#_ 1 2 3] ; disables two nested constant forms 1 and 2

Control structures

Clojure is equipped with many special forms and macros that serve to control the expression-evaluation process, including the order of evaluation, the number of repetitions, and conditional processing.

Conditional expressions

The conditional statements known from imperative languages have counterparts in Lisps as conditional special forms or macros. A special form and a macro differ from a function in that certain arguments passed to them may or may not be evaluated (depending on the specific form) before the subprogram is invoked. Thanks to this property it is possible to build, among other things, expressions in which the evaluation of certain arguments depends on the result of evaluating other (earlier) ones. We can thus control whether and when the pass-by-value principle takes effect, and thereby build conditional constructs or loops.

Special forms and macros will be discussed later, and the fundamental difference between them is that macros are a language feature allowing source code transformations, whereas special forms are constructs built into the evaluator. Both may have evaluation and argument-passing behavior that differs from the default, and may affect the surroundings (the state of other in-memory objects).

Conditional execution, if

The if special form serves for conditional evaluation of expressions.

Usage:

  • (if predicate true-expression false-expression?).

The if form accepts two mandatory arguments. The first is a condition to be evaluated; if its value is neither false nor nil, the next argument will also be evaluated. If an optional third argument is provided, it will be evaluated only when the first argument has the value false or nil.

The if special form returns the value of the most recently evaluated expression, or nil if no third argument was provided and the first expression returned a logical falsehood.

Example of the if special form
1(if (= 2 2)
2  "equal"
3  "different")
4; => "equal"
(if (= 2 2) &#34;equal&#34; &#34;different&#34;) ; =&gt; &#34;equal&#34;

Together with the if form, the and macro is frequently used. It serves to express the logical conjunction (logical AND) operation. It evaluates the values of expressions given as its arguments (in the order they appear) as long as their value is logical truth (not false and not nil). The returned value is then the value of the last given expression. If any expression returns false or nil, processing of subsequent arguments is halted and its value is returned.

Example of the and macro
1(if (and (= 2 2) (< 2 3)) true)
2; => true
(if (and (= 2 2) (&lt; 2 3)) true) ; =&gt; true

Another macro found alongside if is or. It serves to express the logical disjunction (logical OR) operation. This macro evaluates the values of successive expressions given as its arguments (in the order they appear) until the value of one of them is logical truth (not false and not nil). The returned value is then the value of the most recently processed expression. If the values of all expressions are false or nil, the value of the last given expression is returned (false or nil).

Example of the or macro
1(if (or (= 2 2) (< 4 3)) true)
2; => true
(if (or (= 2 2) (&lt; 4 3)) true) ; =&gt; true

Conditional execution, if-not

The if-not macro is the inverse version of the if special form.

Usage:

  • (if-not predicate false-expression true-expression?).

The macro accepts two mandatory arguments. The first is a condition to be evaluated; if its value is logical falsehood (false or nil), the next argument will also be evaluated. If an optional third argument is provided, it will be evaluated only when the first argument represents logical truth (has a value different from false and different from nil).

The if-not macro returns the value of the most recently evaluated expression, or nil if no third argument was provided and the first expression returned logical truth.

Example of the if-not macro
1(if-not (= 2 2)
2  "different"
3  "equal")
4; => "equal"
(if-not (= 2 2) &#34;different&#34; &#34;equal&#34;) ; =&gt; &#34;equal&#34;

Conditional execution, when

The when macro works similarly to the if construct, but is somewhat simpler because there is no place for an alternative expression (one executed when the given condition is not met).

Usage:

  • (when predicate & expression...).

The first argument should be a form expressing a logical condition (one returning true or false, or possibly nil), and each remaining argument will be treated as an expression that will be evaluated when the value returned by the conditional construct is logical truth (a value different from false and different from nil).

Example of the when macro
1(when (= 2 2)
2  (println "you can provide")
3  (println "multiple expressions"))
4
5; >> you can provide
6; >> multiple expressions
(when (= 2 2) (println &#34;you can provide&#34;) (println &#34;multiple expressions&#34;)) ; &gt;&gt; you can provide ; &gt;&gt; multiple expressions

Conditional execution, when-not

The when-not macro is the inverse version of the when macro.

Usage:

  • (when-not predicate & expression...).

The first argument should be a predicate, and each remaining argument will be treated as an expression that will be evaluated when the predicate’s value is logical falsehood (false or nil).

Example of the when-not macro
1(when-not (= 2 1)
2  (println "you can provide")
3  (println "multiple expressions"))
4
5; >> you can provide
6; >> multiple expressions
(when-not (= 2 1) (println &#34;you can provide&#34;) (println &#34;multiple expressions&#34;)) ; &gt;&gt; you can provide ; &gt;&gt; multiple expressions

Condition list, cond

The cond macro allows writing a list of conditions with corresponding expressions that will be evaluated when a given condition is met.

Usage:

  • (cond & pair...),

where pair is:

  • predicate expression,
  • :else    default-expression.

The macro accepts an even number of arguments. For each pair it checks whether the given conditional expression returns a value different from false and from nil. If so, the associated argument from the pair (the expression) will be evaluated, processing of further arguments will stop, and the returned value will be the value of the evaluated expression.

Optionally, a final pair (option) can be added whose first element equals :else. Its associated expression will be evaluated when none of the earlier options terminated the process. This allows setting a default return value.

If no default option was given and no condition was met, the returned value will be nil.

Example of the cond macro
1(cond
2  (= 2 3) "2 and 3 are equal"
3  (< 2 3) "2 is less than 3"
4  :else   "neither")
5
6; => "2 is less than 3"
(cond (= 2 3) &#34;2 and 3 are equal&#34; (&lt; 2 3) &#34;2 is less than 3&#34; :else &#34;neither&#34;) ; =&gt; &#34;2 is less than 3&#34;

Case list, case

The case special form works similarly to cond, but instead of conditional expressions in the options, constant values should be given that will be matched against the value of the first argument provided.

Usage:

  • (case value option...),

where option is:

  • value      expression,
  • (value...) expression,
  • default-expression.

If we want to compare the given value against multiple values from a series for each possible option, that series of possibilities should be written as a list S-expression.

The values against which the first argument’s value will be compared must be constant forms, since they are processed at compile time. We may therefore provide literals or other atomic representations that express their own values. The expressions to be evaluated upon a positive test may, in turn, represent arbitrary forms.

Optionally, a last argument can be provided as an expression that will be evaluated if the tested value does not satisfy any of the earlier conditions. It is worth using this possibility, because otherwise the program will throw an exception when no match is found.

Example of the case special form
1(def x 'cookies) ; x refers to the symbol cookies
2
3(case x
4  ""                    "nothing"
5  (mints dragees)       "sweets"
6  (cookies gingersnaps) "biscuits"
7  (str x " is an unknown product"))
8
9; => "biscuits"
(def x &#39;cookies) ; x refers to the symbol cookies (case x &#34;&#34; &#34;nothing&#34; (mints dragees) &#34;sweets&#34; (cookies gingersnaps) &#34;biscuits&#34; (str x &#34; is an unknown product&#34;)) ; =&gt; &#34;biscuits&#34;

It is worth noting that the symbols and lists were not quoted. This is precisely a property of special forms – certain expressions are not evaluated, so there is no need to employ special techniques to prevent evaluation.

Condition list with predicate, condp

The condp macro works similarly to cond, but is much more flexible.

Usage:

  • (condp predicate expression & pair...)

The first argument should be a predicate – a function that should return a boolean value – and the second an expression. The subsequent, optional arguments are options composed of pairs, which can be expressed using the following notations:

  • test-expression result-expression,
  • test-expression :>> result-function,
  • default-expression.

The symbolic notation :>> is a keyword that carries special meaning in the context of the condp macro.

For each given option, the associated function-call form will be evaluated:

  • (predicate test-expression expression).

The predicate will be invoked for each option given in order with two passed arguments. The first will be the value of the test expression of the given option, and the second will be the expression given as the argument to the entire macro invocation.

If the predicate returns a value corresponding to logical truth (different from false and different from nil), checking of subsequent options will stop and the result expression associated with the current option will be returned, or the value produced by invoking the result function (in the case where the :>> keyword was used). In this latter case, the value returned by the predicate will be passed to the function (which should accept one argument).

If a final additional expression was given as the last option (default-expression), its value will be returned if none of the options matched the given data. When such a situation occurs but no default expression was provided, the returned value will be nil.

Example of the condp macro
 1(condp
 2  =           ; predicate
 3  (+ 2 2)     ; expression (yields 4)
 4  1 "one"     ; option 1: string
 5  2 "two"     ; option 2: string
 6  3 "three"   ; option 3: string
 7  4 "four"    ; option 4: string
 8  "none")     ; default option expression
 9
10; => "four"
(condp = ; predicate (+ 2 2) ; expression (yields 4) 1 &#34;one&#34; ; option 1: string 2 &#34;two&#34; ; option 2: string 3 &#34;three&#34; ; option 3: string 4 &#34;four&#34; ; option 4: string &#34;none&#34;) ; default option expression ; =&gt; &#34;four&#34;

Java member access

Member access, .

The . special form in Clojure provides access to Java objects. It is de facto a member-access operator.

  • When the first argument is a symbol naming a Java class, the evaluator will treat the notation as access to a static member of that class.

  • If the first argument is not a Java class name, the evaluator will assume it is a class instance on which we want to invoke a method or access a field value. If no further arguments are provided, it will be a field – unless a public method with the given name that takes no arguments exists, in which case it will be invoked. If both a field with the given name and a method exist, prefixing the symbolic name with a hyphen (-) will indicate that the field is intended.

  • If additional arguments are given beyond the first, a method of the instance with the name given as the first argument will be invoked. The additional arguments will (from left to right) be passed to the invocation after their values are evaluated.

Additional arguments may also be a list S-expression. In such a case, the first element should be a symbol identifying the method to invoke.

There is also a simplified version of this special form in which the dot is part of the symbolic name of the object’s member, which is given as the next argument.

Examples of the . special form
(. " a " trim)             ; => a
(.trim " a ")              ; => a
(. "abc" substring 0 1)    ; => a
(. "abc" (substring 0 1))  ; => a
(. &#34; a &#34; trim) ; =&gt; a (.trim &#34; a &#34;) ; =&gt; a (. &#34;abc&#34; substring 0 1) ; =&gt; a (. &#34;abc&#34; (substring 0 1)) ; =&gt; a

Member access, ..

The macro named .. is syntactic sugar that allows expressing nested access to Java object members in a readable way. The effect of using it is the creation of a cascade of invocations of the . special form discussed above.

In practice, .. performs a series of method calls in which the result of one becomes the input object for the next. Each expression after .. is interpreted similarly to the . form, but without the need for manual nesting.

Examples of the .. macro
(.. " a " trim length)                             ; => 1
(.. " a " trim length toString (replace "1" "X"))  ; => "X"
(.. &#34; a &#34; trim length) ; =&gt; 1 (.. &#34; a &#34; trim length toString (replace &#34;1&#34; &#34;X&#34;)) ; =&gt; &#34;X&#34;

Code organization

In a functional program every construct is an expression that can be computed and returns some value without generating side effects. However, there are classes of problems that, solved in a purely functional way, would produce unreadable or inefficient code. Lisp dialects are therefore tolerant of how algorithms are represented and allow expressing solutions in various ways, offering among other things control structures known from imperative programming languages.

Grouping S-expressions, do

Perhaps the most common construct that lets us momentarily forget that we are dealing with evaluated expressions and treat program elements as statements is the do special form. It serves to group multiple expressions in such a way that only the value of the last one is returned, although side effects (e.g., displaying text on screen) may occur in all of them.

Usage:

  • (do & expression...).

The do special form accepts zero or more S-expressions, which will be evaluated in the order they are given. The returned value will be the result of evaluating the constant value of the last one.

If no S-expression is given, the returned value will be nil.

Examples of the do special form
 1(do)
 2; => nil
 3
 4(do 123 456)
 5; => 456
 6
 7(do
 8  (println "test")
 9  123
10  (+ 1 1)
11  (+ 2 2))
12; >> test
13; => 4
(do) ; =&gt; nil (do 123 456) ; =&gt; 456 (do (println &#34;test&#34;) 123 (+ 1 1) (+ 2 2)) ; &gt;&gt; test ; =&gt; 4

Appending an argument, doto

The doto macro allows appending a given value as the first argument to the invocations of the given expressions. It is most commonly used to represent Java method calls in a clear way, or when we want the returned value to be the value of the first argument of some function, macro, or special form invocation.

  • (doto value & expression...).

The macro accepts one mandatory argument, which should be an arbitrary value. Each subsequent argument will be treated as an S-expression that, before evaluation, will be transformed so that the given value is substituted in the position of its first invocation argument.

The value returned by the doto macro is the value passed as the first argument.

Examples of the doto macro
 1(doto 2
 2  (println "the first time")
 3  (println "the second time"))
 4
 5; >> 2 the first time
 6; >> 2 the second time
 7; => 2
 8
 9(doto (java.util.ArrayList.)
10  (.add 1)
11  (.add 2)
12  (.add 3))
13
14; => [1 2 3]
(doto 2 (println &#34;the first time&#34;) (println &#34;the second time&#34;)) ; &gt;&gt; 2 the first time ; &gt;&gt; 2 the second time ; =&gt; 2 (doto (java.util.ArrayList.) (.add 1) (.add 2) (.add 3)) ; =&gt; [1 2 3]

Timing, time

The time macro allows us to display the execution time of a given expression. It is useful when we want to optimize parts of a program for performance.

Usage:

  • (time expression).

The macro’s argument should be an S-expression whose value will be computed. do can be used if we want to provide more expressions.

The value returned by the time macro is the value of the evaluated expression that was given.

A side effect of invoking the macro is the display of an execution-time report on standard output.

Examples of the time macro
1(time 123)
2; >> "Elapsed time: 0.012667 msecs"
3; => 123
4
5(time (Thread/sleep 500))
6; >> "Elapsed time: 505.079375 msecs"
7; => nil
(time 123) ; &gt;&gt; &#34;Elapsed time: 0.012667 msecs&#34; ; =&gt; 123 (time (Thread/sleep 500)) ; &gt;&gt; &#34;Elapsed time: 505.079375 msecs&#34; ; =&gt; nil

Threading, ->

The threading macro, also called the thread-first macro, enables presenting source code with nested invocations in a more readable way. It works by accepting a set of S-expressions and treating each one except the first as a call form (even if it is not enclosed in parentheses), and then inserting in the position of the first argument of each form the value returned by the preceding invocation. Thanks to this we can momentarily forget about Polish notation.

Usage:

  • (-> expression & expression...).

The macro’s arguments are S-expressions (at least one), and the returned value is the value of the last provided S-expression.

Example of the -> macro
1(-> [1 2 3 4]  ; result [1 2 3 4]
2    last       ; result (last [1 2 3 4]) => 4
3    (- 2)      ; result (- 4 2)          => 2
4    println    ; result (println 2)      => nil
5)
6
7; >> 2
(-&gt; [1 2 3 4] ; result [1 2 3 4] last ; result (last [1 2 3 4]) =&gt; 4 (- 2) ; result (- 4 2) =&gt; 2 println ; result (println 2) =&gt; nil ) ; &gt;&gt; 2

Let us look at the same operations arranged in code without using the macro:

1(println
2  (- (last
3      [1 2 3 4]))
4     2)
5
6; => nil
7; >> 2
(println (- (last [1 2 3 4])) 2) ; =&gt; nil ; &gt;&gt; 2

Let us compare both notations in more compact forms:

1(println (- (last [1 2 3 4])) 2)   ; from the inside out
2(-> [1 2 3 4] last (- 2) println)  ; from left to right
(println (- (last [1 2 3 4])) 2) ; from the inside out (-&gt; [1 2 3 4] last (- 2) println) ; from left to right

The difference is easy to see: in the first we evaluate values from the most nested to the enclosing ones, and in the second from left to right (with the result substituted each time as the first argument).

Thread-last, ->>

The thread-last macro works similarly to the threading macro, but the value of each preceding S-expression will be substituted not as the first but as the last argument of the next one.

Usage:

  • (->> expression & expression...).

The macro’s arguments are S-expressions (at least one), and the returned value is the value of the last provided S-expression.

Example of the ->> macro
1(->> [1 2 3 4]  ; result [1 2 3 4]
2     last       ; result (last [1 2 3 4]) => 4
3     (- 2)      ; result (- 2 4)          => -2
4     println    ; result (println -2)     => nil
5)
6
7; => nil
8; >> -2
(-&gt;&gt; [1 2 3 4] ; result [1 2 3 4] last ; result (last [1 2 3 4]) =&gt; 4 (- 2) ; result (- 2 4) =&gt; -2 println ; result (println -2) =&gt; nil ) ; =&gt; nil ; &gt;&gt; -2

In the above example we can see that the difference in computations compared to the previous examples appears in line 3. The subtraction arguments are swapped because the value is substituted as the second argument of the subtraction.

Threading non-nils, some->

The some-> macro works similarly to the -> macro, but threading and further evaluation of expressions stops immediately when any of them in the chain returns nil. For this reason the non-nil result threading macro is used where some form might return an indeterminate value, e.g., when searching collections or accessing Java methods that, if called with a nil argument, would produce an exception.

Usage:

  • (some-> expression & expression...).

The macro’s arguments are S-expressions (at least one), and the returned value is the value of the last provided S-expression, or nil if the value of any of them is nil.

Example of the some-> macro
1(some-> [1 2 3 4]  ; result [1 2 3 4]
2        get 50     ; result (get [1 2 3 4] 50)  => nil
3        (+ 2)      ; not evaluated
4        println    ; not evaluated
5)
6
7; => nil
(some-&gt; [1 2 3 4] ; result [1 2 3 4] get 50 ; result (get [1 2 3 4] 50) =&gt; nil (+ 2) ; not evaluated println ; not evaluated ) ; =&gt; nil

We can see that in line 2 further evaluation of S-expressions in the chain was halted because the form (get [1 2 3 4] 50) returned nil (when attempting to access a nonexistent element at index 50). Let us see what would happen if we used -> instead of some->:

1(-> [1 2 3 4]  ; result [1 2 3 4]
2    get 50     ; result (get [1 2 3 4] 50)  => nil
3    (+ 2)      ; result (+ nil 2)  (exception)
4    println    ; not evaluated (exception)
5)
6
7; >> java.lang.NullPointerException:
(-&gt; [1 2 3 4] ; result [1 2 3 4] get 50 ; result (get [1 2 3 4] 50) =&gt; nil (+ 2) ; result (+ nil 2) (exception) println ; not evaluated (exception) ) ; &gt;&gt; java.lang.NullPointerException:

Instead of obtaining the indeterminate value nil, the program terminated abnormally, reporting a Java exception. This was caused by an attempt to sum nil and the number 2.

Threading non-nils, some->>

The some->> macro works similarly to some->, but the values of previously evaluated S-expressions are substituted as the last arguments of subsequent ones.

Usage:

  • (some->> expression & expression...).

The macro’s arguments are S-expressions (at least one), and the returned value is the value of the last provided S-expression, or nil if the value of any of them is nil.

Example of the some->> macro
1(some->> [2 4 6 8]      ; result [2 4 6 8]
2         (filter odd?)  ; result (filter odd? [2 4 6 8]) => ()
3         first          ; result (first ()) => nil
4         (+ 2)          ; not evaluated
5         println        ; not evaluated
6)
7
8; => nil
(some-&gt;&gt; [2 4 6 8] ; result [2 4 6 8] (filter odd?) ; result (filter odd? [2 4 6 8]) =&gt; () first ; result (first ()) =&gt; nil (+ 2) ; not evaluated println ; not evaluated ) ; =&gt; nil

In line 2 we used the not-yet-discussed function filter. It filters the collection given as the second argument, and to decide whether an element will be included in the output data structure or skipped it uses the predicate passed as the first invocation argument.

Here the role of the predicate is played by the odd? function, which returns logical truth when the value of its argument is an odd number. As a result, filter will produce an empty collection because the input contains only even numbers.

Invoking first on an empty collection returns nil, and at that point threading ends. Had we not used the some->> macro, the S-expression (+ 2 nil) would have led to an exception being thrown during evaluation.

Conditional threading, cond->

The cond-> macro works similarly to ->, but S-expressions are interpreted in pairs. The first of each pair is a conditional expression, and the second will be evaluated only when the first is truthy (its value is neither false nor nil). Each second expression of a pair is treated as a call form even if it is not enclosed in parentheses.

Conditional threading of the first argument is useful for clearly expressing a set of computations that depend on current conditions.

Usage:

  • (cond-> expression & pair...),

where pair is:

  • predicate expression.

The cond-> macro accepts an initial expression and pairs of S-expressions, where the first is a predicate and the second will be evaluated in such a way that the result of evaluating the previous expression in the chain is inserted as the first argument. The returned value is the value of the most recently evaluated S-expression.

Example of the cond-> macro
 1(def add-one?      true)
 2(def subtract-one? false)
 3
 4(cond-> 123              ; result 123
 5        add-one?      inc     ; result (inc 123) => 124
 6        subtract-one? dec     ; not evaluated
 7        true          println ; result (println 124) => nil
 8)
 9
10; => nil
11; >> 124
(def add-one? true) (def subtract-one? false) (cond-&gt; 123 ; result 123 add-one? inc ; result (inc 123) =&gt; 124 subtract-one? dec ; not evaluated true println ; result (println 124) =&gt; nil ) ; =&gt; nil ; &gt;&gt; 124

Conditional threading, cond->>

The conditional thread-last macro works similarly to cond->, but – as the name suggests – the value of the most recently evaluated S-expression is substituted in the position of the last argument of the currently evaluated expression.

Conditional threading of the last argument is useful for clearly expressing transformations performed on collections, since many built-in functions operating on such data structures accept them as their last arguments.

Usage:

  • (cond->> expression & pair...),

where pair is:

  • predicate expression.

The cond->> macro accepts an initial expression and pairs of S-expressions, where the first is a predicate and the second will be evaluated in such a way that the result of evaluating the previous expression in the chain is inserted as the last argument. The returned value is the value of the most recently evaluated S-expression.

Example of the cond->> macro
 1(def evens? true)
 2(def odds? false)
 3
 4(cond->> [1 2 3 4]                   ; [1 2 3 4]
 5         evens?    (filter even?)    ; (filter even? [1 2 3 4]) => (2 4)
 6         odds?     (filter odd?)    ; not evaluated
 7         true      println          ; (println '(2 4)) => nil
 8)
 9
10; => nil
11; >> (2 4)
(def evens? true) (def odds? false) (cond-&gt;&gt; [1 2 3 4] ; [1 2 3 4] evens? (filter even?) ; (filter even? [1 2 3 4]) =&gt; (2 4) odds? (filter odd?) ; not evaluated true println ; (println &#39;(2 4)) =&gt; nil ) ; =&gt; nil ; &gt;&gt; (2 4)

Binding to a name, as->

The as-> macro serves for threading expression results, but instead of a fixed position we are dealing with the creation of bindings to a symbolic name. This is useful in situations where different invocations require placing the significant argument at different positions.

Usage:

  • (as-> expression symbolic-name & expression...).

The macro’s arguments are an initial expression whose value will be bound to the symbolic name (given as the second argument), and then threaded through the subsequent S-expressions in such a way that the value of each is bound to the symbolic name before the next one is invoked.

Example of the as-> macro
1(as-> [1 2 3 4] result       ; after substituting result:
2      (filter even? result)  ; (filter even? [1 2 3 4])
3      (last result)          ; (last [2 4])
4      [result (inc result)]) ; [4 5]
5
6; => [4 5]
(as-&gt; [1 2 3 4] result ; after substituting result: (filter even? result) ; (filter even? [1 2 3 4]) (last result) ; (last [2 4]) [result (inc result)]) ; [4 5] ; =&gt; [4 5]

Summary

Translating source code

The diagram below presents the individual stages of translating a program’s source code from its textual form into an executable one, including the process of running and obtaining expression values.

Translation of a program to executable form in Clojure

Questions and answers

Let us answer a few questions that will help summarize the introduction to the basic mechanisms present in Lisps (and in Clojure):

  • What is the difference between an S-expression and a list?

    An S-expression is a grammatical element of syntax – a symbolic way of organizing source code in a text file. A list, on the other hand, is an in-memory data structure that allows storing data in a nested form while preserving information about the order of occurrence. A list is a good way of representing objects defined by list S-expressions owing to the similarity of arrangement. In Lisp dialects, lists are used to store data expressing list S-expressions of source code in the syntax tree, but they can also be used in programs for operating on utility data.

  • What is the difference between an S-expression and an atom?

    An atom is also an S-expression, but a non-compound one. In the abstract syntax tree it will not be represented by a list object, and in the textual form of the program it will not be expressed by elements placed in parentheses.

  • What is the difference between an S-expression and a form?

    A form is a syntactically valid S-expression read into memory and converted to known data structures, which is ready for evaluation (computation of its value in the evaluation process).

  • What is the difference between a symbol and a global variable?

    A global variable is a Var object placed in a map called a namespace that stores a reference to another object (a value). A symbol is a Symbol object that does not store any reference, but in the same map is used to identify the Var object (it is the indexing key). One could say that a symbol names a global variable.

See also:

Current section: Read Me Clojure
Categories:

Taxonomies: