Pattern matching is an elegant, useful feature of modern programming language design. Pattern matching is part of common idioms in functional programming languages including Standard ML, Haskell and their derivatives. Pattern matching is also part of the design of Typed Racket.
Programs that make use of pattern matching are cleaner, easier to read and more pleasant to work with than their pattern-free counterparts. Pattern matching is not addressed in How to Design Programs, but since it offers so many advantages, and no disadvantages, and is so well-suited to the style of programming taught in this course, we include it in our curriculum.
Typed Racket’s pattern matching offers a rich collection of syntactic forms, only some of which are included in this document. Note that you will find many other pattern matching forms in the language documentation, but, for the purposes of this course, you only need a few of the available forms.
Pattern matching basics
As conditional expressions are written with cond
, pattern matching
expressions are written with match
. A match expression has the
following general form:
(match exp_0
[pat_1 exp_1]
[pat_2 exp_2]
...
[pat_k exp_k])
Match
evaluation is similar to cond
evaluation. First the
expression exp_0
is evaluated to a value we’ll call v
. Then v
is
compared against each pattern in turn, in the order given: pat_1
,
then pat_2
, and so on, up to pat_k
. If v
matches pat_1
, then
the value of the whole expression is exp_1
, and evaluation within
the match
goes no further. If pat_1
doesn’t match, then v
is
checked against pat_2
, and the value is exp_2
. Evaluation
continues from pattern to pattern until a match is found (or none is).
Consider this match expression:
(match (+ 1 1)
[2 "A"]
[3 "B"])
In this case, (+ 1 1)
is evaluated to 2
. This is compared first
against the first pattern in order, which is a constant pattern
2
. This is a match, so the value of the whole expression is the
string "A"
. With the following alteration, the value of the
expression is "B"
:
(match (+ 1 2)
[2 "A"]
[3 "B"])
There are many useful kinds of patterns besides constant patterns: one such kind is variable patterns. A variable pattern consists of a variable name, and a) always succeeds and b) gives a name to the value matched. Consider this example:
(match (* 10 3)
[n (+ n 1)])
The value of this expression is 31. Why? The expression immediately
following match
evaluates to 30. Then 30 is compared against the
variable pattern n
, which succeeds in matching (as variable patterns
always do) and furthermore binds the name n
to the expression’s
value. When the expression (+ n 1)
is evaluated, n
is bound to 30,
and we finally arrive at 31.
Because variable patterns always match, they can be used as catch-all
or default patterns in a pattern match, playing a role not entirely
unlike what else
does in the last branch of a cond
. For example:
(match (+ 8 9)
[0 "zero"]
[n "not zero"])
In this expression, the match of evaluated 17 against the constant
pattern 0
fails. The match against n
necessarily succeeds, and the
value of the expression altogether is the string "not zero"
.
The variable pattern n
in the last example is a suitable candidate
for a wildcard pattern, written with the underscore character
_
. Recall that a variable pattern both matches a value and binds a
name to it, but in this case (and many cases in practice) the name is
not needed, since n
is not subsequently used. In such cases as this,
a wildcard pattern, informally known as a "don’t care" pattern, is a
good choice. A wildcard pattern always succeeds in matching, like a
variable pattern does, but doesn’t bind; that is, it associates no
name with the matched value.
(match (+ 8 9)
[0 "zero"]
[_ "not zero"])
The underscore serves not only to catch any expression that falls past
the constant pattern 0
, but it also expresses clearly that the
particular value of the matching expression is of no interest, only
the fact that it did not match any previous pattern.
Matching structures and nested patterns
Matching against structures is a convenient way to, first, identify the particular struct under consideration, and also to name the individual components in the structure and thereby eliminate calls to selectors.
Define a point in 2-space as follows:
(define-struct Point
([x : Real]
[y : Real]))
Now assume there exists an expression of type Point
bound to a
variable p
, and consider the following match:
(match p
[(Point x y) (+ x y)])
This expression performs the (admittedly contrived) operation of
adding together the x
and y
components of the point p
. The pattern
(Point x y)
serves several purposes: Point
ensures that p
is
actually a Point
struct, and furthermore the names x
and y
are
bound to its two components. Syntactically, this is a real bargain,
although it will take practice and experience to appreciate
why. Without pattern matching, a similar expression looks like this:
(+ (Point-x p) (Point-y p))
On the whole, the latter expression is no longer (in number of
characters) than the first; however, in the part of the expression
where the action is — namely, where x
and y
are added together — the former expression is a clean (+ x y)
, while the latter is
cluttered up with selectors (Point-x
and Point-y
). Pattern
matching often has this clarifying effect by separating the
deconstruction of a compound value from computing with its elements.
Using nested patterns with nested structures pushes the advantages of structure matching still further. Consider a line segment structure containing two points as follows:
(define-struct Seg
([a : Point]
[b : Point]))
Now we consider the following (contrived) computation: to subtract the product of the segments y components from the product of its x components. We can write the expression with nested patterns like this:
(match s
[(Seg (Point ax ay) (Point bx by)) (- (* ax bx) (* ay by))])
Compare that expression to this one that uses selectors:
(- (* (Point-x (Seg-a s)) (Point-x (Seg-b s)))
(* (Point-y (Seg-a s)) (Point-y (Seg-b s))))
In the second expression, the selectors overwhelm the syntax and make
reading and understanding the expression (comparatively) difficult; the matched
alternative above (- (* ax bx) (* ay by))
is crystal clear by
comparison.
Note also that this second example of pattern matching illustrates that the
variable names to which values from within a structure are bound do not need
to have the same names as the fields. While in the first example, we found
it convenient to use x
as the name for the contents of the x
field of
the structure, in the second example, we chose the names ax
and bx
.
Pattern matching in function definitions
Pattern matching is especially useful in the definition of functions. Here is a function definition using only selectors:
(: q1? : Point -> Boolean)
;; test whether the point is in the first quadrant
(define (q1? p)
(and (positive? (Point-x p))
(positive? (Point-y p))))
When we rewrite the function using a match
, the code is not shorter,
but its main expression is free of selectors and hence clearer:
(: pat-q1? : Point -> Boolean)
;; reimplementation of q1? using pattern matching
(define (pat-q1? p)
(match p
[(Point x y)
(and (positive? x)
(positive? y))]))
With nested structures, nested pattern matching cleans up and clarifies the computation to an even greater extent. The following two functions compute the same property of a line segment; compare the syntax of one to the other.
(: p1 : Seg -> Boolean)
(define (p1 s)
(< (* (Point-x (Seg-a s)) (Point-y (Seg-b s)))
(* (Point-x (Seg-b s)) (Point-y (Seg-a s)))))
(: p2 : Seg -> Boolean)
(define (p2 s)
(match s
[(Seg (Point ax ay) (Point bx by))
(< (* ax by) (* bx ay))]))
Matches and disjoint unions
Recall the definition of the Shape
type from Chapter 5.
(define-type Shape (U Circle Rect))
We can use pattern matching to write functions that work over union types like Shape
.
For example, the shape-color
function can be written using pattern matching as
(: shape-color : Shape -> Color)
;; Get the color of a shape.
;;
(define (shape-color shape)
(match shape
[(Circle _ _ col) col]
[(Rect _ _ _ col) col]))
We use wild cards for the parts of the shapes that we are not interested in.
Matching multiple expressions with match*
Typed Racket also provides a mechanism for matching against multiple values with multiple patterns. For example, say we wanted a function to test if two shapes are the same shape and color. We could write that as
(: same-shape-and-color? : Shape Shape -> Boolean)
;; test if two shapes are the same kind and have the same color.
;;
(define (same-shape-and-color? shape1 shape2)
(match* (shape1 shape2)
[((Circle _ _ col1) (Circle _ _ col2)) (symbol=? col1 col2)]
[((Rect _ _ _ col1) (Rect _ _ _ col2)) (symbol=? col1 col2)]
[(_ _) #f]))
Pattern-matching gotchas
You have to be careful to avoid using the identifiers true
, false
, and empty
in patterns. These identifiers are variables (not literals), which means that
they will match any value.
For example, the following function will always return false
.
(: flip : Boolean -> Boolean)
(define (flip b)
(match b ([true false] [false true])))
What happens here is that the value of b
is assigned to a new variable named true
,
rather than being matched to the value true
. The expression then results in false
,
the expression part of the first match.
The correct way to write it is to use the #f
and #t
literals in
the patterns:
(: flip : Boolean -> Boolean)
(define (flip b)
(match b [#t false] [#f true]))
Another common mistake with pattern matching is assuming that a successful match induces type information the way that type filters for type tests do. For example, the following function will result in a type error
(: shape-color : Shape -> Color)
;; Get the color of a shape.
;;
(define (shape-color shape)
(match shape
[(Circle _ _ _) (Circle-color shape)]
[_ (Rect-color shape)]))
Even though matching shape
against the Circle
pattern is only possible if shape
has
the Circle
type, the flow-sensitivity of Typed Racket’s type system does not extend to
pattern matching. Likewise, even though failing to match the Circle
pattern is only
possible if shape
has type Rect
, the Typed Racket type checker is not clever enough
to realize that fact.