Defining more types

In addition to the base types that are built in to Typed Racket, such as Integer, Boolean, String, the language includes mechanisms for defining your own types. You can define custom data structures with define-struct, and also create and use product types and sum types as needed.

Product and sum types are not unique to Racket; they are standard constructions in the theory and practice of programming languages. Loosely speaking, product types represent this and that, and sum types represent this or that. For example, a product type can be used to represent a value that is "an integer and a boolean", while a sum types can represent values that are "an integer or a boolean." Sum types are discussed below, in the discussion of union types; product types are not addressed in this chapter.

Structure Definitions

Typed Racket provides a special syntactic form, define-struct, for defining custom data structures. (See Chapter 6 of HtDP.)

A define-struct definition starts with the name of the data structure being defined, followed by an enumeration of the field names of the components of the structure, and their types. Here is a specific example:

(define-struct Point
  ([x : Real]
   [y : Real]))

By convention, the name of a structure is capitalized, as here. Also, be aware that the order in which the field specifications appears is significant.

In general, the definition of a new structure S looks like this:

(define-struct S
  ([field_1 : Type_1]
   [field_2 : Type_2]
   ...
   [field_k : Type_k]))

When using define-struct in stock Typed Racket, you would also write #:transparent immediately before the final parenthesis. This is required to allow the use of check-expect and to be able to see the contents of structures when they are printed out in the REPL; in stock Typed Racket, the default is for structures to be opaque, meaning that some code cannot see their contents. Opacity is actually a good programming practice that enforces abstraction across the boundaries of different modules of code, preventing one set of code from observing or tampering with internal details of another chunk of code. But, for the purposes of this class, it is more of an impediment. Thus, we have modified the default behavior to make transparent structures automatically. If you encounter someone who took the course in 2014, be aware they will have used the standard #:transparent directive that is no longer needed or allowed.

Each structure definition defines a new type named whatever appears immediately after the define-struct; in the point example above, it defines a new type Point. That new type can be used wherever types are used, including in type ascriptions of values and functions, and in field specifications of other structure definitions. It also generates the definitions of a number of operations that either produce or consume structure of the new type. For example, the definition of Point generates functions with the following types:

(: Point   : Real Real -> Point)
(: Point-x : Point -> Real)
(: Point-y : Point -> Real)
(: Point?  : Any -> Boolean : Point)

The function Point is known as a value constructor, since it constructs, or puts together, data structures of type Point. The next two operations are known as selectors, for taking Points apart into their constituent components.

Note that you can write also make-Point in place of Point, which is what the textbook does. The two functions are equivalent. You may prefer the make-Point form, since it avoids confusion with type names, but we will use the shorter form in this document.

You may notice that the type of Point? has three parts (Any, Boolean, and Point). The first two are the same as any other function type and indicate that the predicate takes any value and returns a boolean. The third part, after the colon, is a filter that tells the typechecker two things:

If an application of Point? evaluates to true, its argument variable has type Point.
If the application of Point? evaluates to false, its argument variable does not have type Point.

Predicates for all built-in types are annotated with similar filters that allow the type system to reason about predicate checks. This mechanism is called occurrence typing and allows functions, such as the following, to pass the type checker:

(: point-x? : Any -> Real)
(define (point-x? p)
  (if (Point? p) (Point-x p) 0))

since the typechecker knows that p has type Point in the true branch of the if. This mechanism is restricted to arguments that are variables.

Note that while you will use functions with these occurrence types, you will not need to write any of your own.

Union Types

Union types provide a mechanism to define a type that belongs to one of several types. Union types are constructed with the built-in type operator U and look like this:

(: U T_1 T_2 ... T_k)

for types T_i.

It is helpful to think of types as sets to understand the use of the term union here. The following union type

(: U Integer Boolean)

is the union of all integers and all (both) booleans. A value of this type would therefore need to be either an Integer or a Boolean, which is to say it would need to inhabit the union of the set of integers and the set of booleans. Read through this interaction transcription to see how this plays out.

> (: x (U Integer Boolean))
> (define x 10)
> x
- : (U Boolean Integer)
10
> (define x #t)
> x
- : (U Boolean Integer)
#t
> (define x "hello!")
Type Checker: type mismatch
  expected: (U Boolean Integer)
  given: String in: "hello!"

Sum types, mentioned at the top of this page, can be defined in Typed Racket as union types. In the technical literature, a sum type is the disjoint union of two or more types, meaning a value in the type (U T_1 T_2) must be a value either of type T_1 or T_2, but not both. Typed Racket does not enforce this; types such as (U Positive-Integer Integer) and even (U Integer Integer) are legal. Therefore, sum types are not synonymous with union types, but rather one of the ways union types can be used in Typed Racket.

Let’s work through an extended example. We might define a representation of colored circles, where the available colors are red, green, or blue. We can use symbols to represent the distinct colors; i.e., 'red, 'green, and 'blue. A symbol in Typed Racket is its own type, as illustrated by the following code snippet:

(: R : 'red)
(define R 'red)

This kind of type is called a singleton type, since it is a type that has exactly one value. By combining multiple singleton types in a union, we can describe enumeration types. For example, our color type is the union of 'red, 'green, and 'blue. Two other useful singleton types are #t and #f (as one might guess, Boolean is the union of #t and #f).

We can define a function that maps colors to their names as follows:

(: color-to-string : (U 'red 'green 'blue) -> String)
;; convert a color to a string
(define (color-to-string c)
  (cond
    [(symbol=? c 'red) "red"]
    [(symbol=? c 'blue) "blue"]
    [else "green"]))

Notice that the declared type of color-to-string guarantees that the else clause of the conditional will always correspond to green.

The order in which types appear in a union does not matter. For example, the type (U 'a 'b) is the same as the type (U 'b 'a).

Our definition of colored circles is as follows:

(define-struct Circle
  ([center : Point]
   [radius : Real]
   [color : (U 'red 'green 'blue)]))

Now we can define a blue unit circle centered at (2, 3) as follows:

(: blue-circ : Circle)
(define blue-circ
  (Circle (Point 2 3) 1 'blue))

We can also define unions of all kinds of types. For example, we might have a representation of colored rectangles:

;; colored axis-aligned box
(define-struct Rect
  ([center : Point]
   [width : Real]
   [height : Real]
   [color : (U 'red 'green 'blue)]))

and then define a function for getting the color from a shape

(: shape-color : (U Circle Rect) -> (U 'red 'green 'blue))
;; get the color of a shape
(define (shape-color shape)
  (cond
    [(Circle? shape) (Circle-color shape)]
    [else (Rect-color shape)]))

This function is an example of the use of occurrence typing in a conditional. In the first clause of the conditional, the Circle? predicate determines that the shape variable has the type Circle in the expression part and does not have that type in the rest of the conditional. Because the typechecker knows that the variable shape must be either a Circle or a Rect, it knows that the type of shape must be Rect in the else clause.

Type definitions

Since type expressions can get quite large and complicated, it is helpful to introduce names for such types. This objective fits nicely with the concept of a data definition that is introduced in the textbook (Section 6.4).

For example, instead of writing

;; A color is one of 'red, 'blue, or 'green.

we can give an explicit definition using the define-type construct

(define-type Color (U 'red 'blue 'green))

We can also give a name to our shape type:

(define-type Shape (U Circle Rect))

Subtyping

Union types give rise to a notion of subtyping; i.e., the idea that one type is contained in another. For example, the type 'red is contained in the type Color, so we say that 'red is a subtype of Color. Intuitively, you can think of subtyping in Typed Racket as corresponding to the subset relation of the sets of values in the types. For example, the value 'red is contained in the set of Color values {'red, 'green, 'blue}.

Another example of subtyping that you have already experienced is the hierarchy of numeric types. The type Natural is a subtype of Integer, Integer is a subtype of Exact-Rational, etc.

Subtyping comes into play when Typed Racket checks the types of arguments against the type of a function in an application. If an argument expression has a type that is a subtype of the expected type, then it is okay. For example, when checking the expression

(color-to-string 'red)

we have an argument of type 'red, which is a subtype of the expected type.

Last updated 2020-01-24 15:55:22 -0600

Table of Contents | Back to CS151 Home