Syntax

This document is a work-in-progress that will be updated during Winter 2023.

Haskell has many syntactic forms, which can be daunting when starting out. Here is an incomplete description of Haskell syntax, largely (but not entirely) in the order in which they are encountered in the lecture notes. The primary purpose of this reference is to develop intuitions about the different building blocks — thus, some minor or technical details deviate from the actual language definition and implementation.

Metavariables

	Syntactic Form	Color
e	expression	evergreen
t	type	tan
v	value
x	variable	violet
p	pattern	pink
k	kind
…	(other)

GHCi

> e v

> :t e t

> :k t k

Basic Syntax

The following lists include most of the syntactic forms encountered in Introduction and Lists.

An expression e can take many forms, including:

()
True , False
‘C’ , “MSC” , …
1 , 2 , 3.14 , …
(e_1, …, e_n)
[e_1, …, e_n]
e_1 e_2
e_1 binop e_2
(binop) e_1 e_2
let x = e_1 in e_2
x
case e_1 of { True -> e_2; False -> e_3 }
case e_1 of { Just x -> e_2; Nothing -> e_3 }
case e_1 of { (x:xs) -> e_2; [] -> e_3 }
e :: t

A type t can take many forms, including:

() , Bool , Char , String , Int , Integer , Double , …
(_t_1, …, _t_n)
[_t]
_t_1 -> … -> _t_n
{- forall a b … . -} _t_1 -> … -> _t_n
a

By default in Haskell, all universally quantified variables in a type t must appear all the way to the left — that is, types must be in prenex normal form. Thus, we write the metavariable _t (with an underscore) to range over all type forms except universal quantification.

Where Clauses

e_2 where x = e_1 = let x = e_1 in e_2

Conditionals

if e_1 then e_2 else e_3 = case e_1 of { True -> e_2; False -> e_3 }

Guarded Equations

f x_1 … x_n | pred_1 = e_1 … | pred_n = e_n = f x_1 … x_n = if pred_1 then e_1 else if … else if pred_n then e_n else undefined

Multiple Equations

f p_11 … p_1m = e_1 … f p_n1 … p_nm = e_n = f x_1 … x_n = case (x_1, …, x_m) of (p_11, …, p_1m) -> e_1 … (p_n1, …, p_nm) -> e_n

Source Files

module ModuleName where import ModulePath_1 … name_1 :: type_1 name_1 = exp_1 … func_1 :: argType_1 -> argType_m -> retType_1 func_1 arg_11 … arg_1m = funcBody_1 func_1 arg_n1 … arg_nm = funcBody_n … main :: IO () main = mainExp

Do-notation (for IO)

do { stmt_1; …; stmt_n; e }

Each stmt takes one of three forms:

let x_i = e_iwhere e_i :: t_i for some t_i and thus x_i :: t_i
x_i <- e_iwhere e_i :: IO t_i for some t_i and thus x_i :: t_i
e_iwhere e_i :: IO t_i for some t_i

The first kind of statement is an ordinary (side-effect free) let-binding. Notice that let-bindings within a do-block do not use the keyword in.

The second kind of statement runs a (potentially side-effecting) action e_i of some IO t_i type; the result of type t_i is bound to the name x_i.

The third kind of statement is just like the second, except that the result is not bound to a name for subsequent reference. It is syntactic sugar for _ <- e_i, where _ is a wild-card pattern.

The type of each stmt can be, and often is, different.

The final statement must be an expression e; its type IO t is the overall type of the entire do-block.

Algebraic Datatypes (ADTs)

Type Definitions

type T = C_1 t_11 t_12 … | C_2 t_21 t_22 … | … deriving (Class_1, Class_2, …)

Case Expressions

case e of { p_1 -> e_1; …; p_n -> e_n; }

Patterns

Each pattern p takes one of three forms:

x
_
C p_1 … p_n

Polymorphic ADTs

type T a b … = C_1 … | C_2 … | …

Optional Deriving Clause

type T … = C_1 … | C_2 … | … deriving (Class_1, Class_2, …)

Let as Case

let p = e_1 in e_2 ≈ case e_1 of p -> e_2

Notice the approximately equals sign: this rewriting works only for non-recursive definitions.

Type Classes

Lambdas

\x -> e

\p -> e = \x -> case x of p -> e

Sections

(n binop) = \m -> n binop m

(binop m) = \n -> n binop m

List Comprehensions

[ returnExp | stmt_1 , … , stmt_n ]

Each stmt, separated by commas, takes one of three forms:

let x = e let-binding
x <- xs “iterate” over the list xs
e filter according to Boolean predicate

A list comprehension gets desugared as follows

[ returnExp | stmt_1 , … , stmt_n ] = ⟦ stmt_1 , … , stmt_n ⟧

where its statements are desugared left-to-right:

⟦ let x = e, stmts ⟧ = let x = e in ⟦ stmts ⟧
⟦ x <- xs, stmts ⟧ = flip concatMap e (\x -> ⟦ stmts ⟧)
⟦ e, stmts ⟧ = if not e then [] else ⟦ stmts ⟧)
⟦ ⟧ = [returnExp]

Do-notation (Revisited)

do { stmt_1; …; stmt_{n-1}; e_n } :: m t_n

Each stmt takes one of three forms, where m is some Monad type and is the same m for each bind-statement in the block:

let p_i = e_iwhere e_i :: t_i for some t_i and thus p_i :: t_i
p_i <- e_iwhere e_i :: m t_i for some t_i and thus p_i :: t_i
e_iwhere e_i :: m t_i for some t_i

A do-block gets desugared left-to-right as follows:

⟦ let p = e, stmts ⟧ = let p = e in ⟦ stmts ⟧
⟦ p <- e, stmts ⟧ = e >>= (\p -> ⟦ stmts ⟧)
⟦ e, stmts ⟧ = e >> ⟦ stmts ⟧ = e >>= (\_ -> ⟦ stmts ⟧)
⟦ e_n ⟧ = e_n -- the last statement in the block

List Comprehensions (Revisited)

A list comprehension

[ returnExp | stmt_1 , … , stmt_n ] :: [] returnTyp

gets desugared to

do { ⟦ stmt_1 ⟧; …; ⟦ stmt_n ⟧; pure returnExp } :: [] returnTyp

where its statements are desugared:

⟦ let x = e ⟧ = let x = e
⟦ x <- xs ⟧ = x <- xs
⟦ e ⟧ = guard e

Notice where this translation inserts pure and guard. Translating the resulting do-block results in the first desugaring of list comprehensions shown above.