This document is a work-in-progress that will be updated during Winter 2023.
Haskell has many syntactic forms, which can be daunting when starting out. Here is an incomplete description of Haskell syntax, largely (but not entirely) in the order in which they are encountered in the lecture notes. The primary purpose of this reference is to develop intuitions about the different building blocks — thus, some minor or technical details deviate from the actual language definition and implementation.
Syntactic Form | Color | |
---|---|---|
e | expression | evergreen |
t | type | tan |
v | value | |
x | variable | violet |
p | pattern | pink |
k | kind | |
… | (other) |
> e
v
> :t e
t
> :k t
k
The following lists include most of the syntactic forms encountered in Introduction and Lists.
An expression e
can take many forms,
including:
()
True , False
‘C’ , “MSC” , …
1 , 2 , 3.14 , …
(e_1,
…, e_n)
[e_1,
…, e_n]
e_1
e_2
e_1
binop e_2
(
binop)
e_1 e_2
let x =
e_1 in e_2
x
case e_1 of { True ->
e_2; False
-> e_3 }
case e_1 of { Just x -> e_2; Nothing ->
e_3 }
case e_1 of { (x:xs) ->
e_2; [] -> e_3 }
e ::
t
A type t
can take many forms,
including:
() , Bool , Char , String , Int , Integer , Double , …
(_t_1,
…, _t_n)
[_t]
_t_1
-> … -> _t_n
{- forall
a b …
. -}
_t_1 -> … ->
_t_n
a
By default in Haskell, all universally quantified variables in a type
t must appear all the way to the left —
that is, types must be in prenex normal
form. Thus, we write the metavariable
_t
(with an underscore) to range over all
type forms except universal quantification.
e_2 where
x = e_1
=
let x = e_1 in e_2
if e_1 then
e_2 else e_3
=
case e_1 of {
True -> e_2; False ->
e_3 }
f
x_1 … x_n
=
| pred_1 = e_1
…
| pred_n = e_n
f x_1 … x_n =
if pred_1 then e_1
else if …
else if pred_n then e_n
else undefined
f
p_11 … p_1m = e_1
=
…
f p_n1 … p_nm = e_n
f
x_1 … x_n =
case (x_1, …,
x_m) of
(p_11, …, p_1m)
-> e_1
…
(p_n1, …, p_nm)
-> e_n
module ModuleName where
import ModulePath_1
…
name_1 :: type_1
name_1 = exp_1
…
func_1 :: argType_1 -> argType_m -> retType_1
func_1 arg_11 …
arg_1m = funcBody_1
func_1 arg_n1 …
arg_nm = funcBody_n
…
main :: IO ()
main = mainExp
do { stmt_1; …; stmt_n; e }
Each stmt
takes one of three
forms:
let
x_i = e_i
where e_i ::
t_i for some t_i
and thus x_i ::
t_i
x_i
where e_i <-
e_i::
IO
t_i for some t_i
and thus x_i ::
t_i
e_i
where
e_i ::
IO
t_i for some t_i
The first kind of statement is an ordinary (side-effect free)
let
-binding. Notice that let
-bindings within a
do
-block do not use the keyword in
.
The second kind of statement runs a (potentially side-effecting)
action e_i of some IO
t_i type; the result of type t_i is bound to the name x_i.
The third kind of statement is just like the second, except that the
result is not bound to a name for subsequent reference. It is syntactic
sugar for _
<-
e_i, where
_ is a wild-card pattern.
The type of each stmt can be, and often is, different.
The final statement must be an expression e; its type IO
t is the overall type of the entire
do
-block.
type T
= C_1 t_11
t_12 …
| C_2 t_21
t_22 …
| …
deriving (Class_1, Class_2, …)
case e of {
p_1 -> e_1; …; p_n
-> e_n; }
Each pattern p
takes one of three
forms:
_
type T a b …
= C_1 …
| C_2 …
| …
type T …
= C_1 …
| C_2 …
| …
deriving (Class_1, Class_2, …)
let p = e_1 in e_2
≈
case e_1 of p -> e_2
Notice the approximately equals sign: this rewriting works only for non-recursive definitions.
\
x -> e
= \
p -> e
\
x -> case x of p -> e
= (
n binop)
\
m -> n binop m
= (
binop m)
\
n -> n binop m
[ returnExp |
stmt_1 , … , stmt_n ]
Each stmt
, separated by commas,
takes one of three forms:
let x = e
let-binding
x <- xs
“iterate” over the list
xs
e
filter according to
Bool
ean predicate
A list comprehension gets desugared as follows
[ returnExp |
stmt_1 , … , stmt_n ]
= ⟦ stmt_1
, … , stmt_n ⟧
where its statements are desugared left-to-right:
⟦
let x = e, stmts
⟧ =
let x = e in ⟦ stmts
⟧
⟦
x <- xs, stmts
⟧ =
flip concatMap e (\x -> ⟦
stmts ⟧)
⟦
e
,
stmts ⟧ = if not e then [] else ⟦ stmts ⟧)
⟦ ⟧ = [returnExp]
do { stmt_1; …; stmt_{n-1}; e_n }
::
m t_n
Each stmt
takes one of three
forms, where m is some Monad
type and is the same m for each
bind-statement in the block:
let
p_i = e_i
where e_i ::
t_i for some t_i
and thus p_i ::
t_i
p_i
where e_i <-
e_i::
m t_i for some
t_i and thus p_i ::
t_i
e_i
where
e_i ::
m t_i for some
t_i
A do
-block gets desugared left-to-right as follows:
⟦
let p = e, stmts
⟧ =
let p = e in ⟦ stmts
⟧
⟦
p <- e, stmts
⟧ =
e
>>=
(\p -> ⟦ stmts ⟧)
⟦
e
,
stmts ⟧ = e
= >>
⟦ stmts ⟧ e
>>=
(\_ -> ⟦ stmts ⟧)
⟦ e_n ⟧ = e_n
-- the last statement in the block
A list comprehension
[ returnExp |
stmt_1 , … , stmt_n ]
::
[] returnTyp
gets desugared to
do { ⟦ stmt_1 ⟧; …; ⟦ stmt_n ⟧;
pure
returnExp } ::
[] returnTyp
where its statements are desugared:
⟦
let x = e
⟧ = let x = e
⟦
x <- xs
⟧ = x <- xs
⟦
e
⟧ = guard e
Notice where this translation inserts pure
and
guard
. Translating the resulting do
-block
results in the first desugaring of list comprehensions shown above.