Introduction to ML in Elm

We will be using Elm v0.19.1. If there are minor language revisions released throughout the quarter, we will decide whether or not to upgrade. You should get Elm up and running as soon as possible to make sure that you have a working development environment.

Let's jump in with some examples at the REPL (read-eval-print loop).

% elm repl
--- REPL 0.19.1 ----------------------------------------------------------------
Say :help for help and :exit to exit! More at <https://elm-lang.org/0.19.1/repl>
--------------------------------------------------------------------------------
>

Basic Values

> True
True : Bool

> False
False : Bool

> 'a'
'a' : Char

> "abc"
"abc" : String

> 3.0
3 : Float

Numeric literals without a decimal point are described by the type variable number, which describes both Ints and Floats.

> 3
3 : number

One way to read the last line above is "for every type number such that number = Int or number = Float, 3 has type number." In other words, "3 has type Int and Float" and depending on how the expression is used, the Elm type checker will choose to instantiate the type variable number with one of these types.

> truncate
<function: truncate> : Float -> Int

> truncate 3
3 : Int

> truncate 3.0
3 : Int

(Note: If you are familiar with Haskell, think of number as a type class that is "baked in" to the language. Elm does not have general support for type classes, but it does have a few special purpose type classes like number.)
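As a small sketch of how number gets instantiated, the hypothetical module below (the names NumberDemo, double, asInt, and asFloat are made up for illustration) defines one function over number and then uses it at both Int and Float:

```elm
module NumberDemo exposing (..)

-- `number` can be instantiated to Int or to Float at each use site.
double : number -> number
double n =
    n * 2

-- Here the annotation forces number = Int.
asInt : Int
asInt =
    double 3

-- Here the annotation forces number = Float.
asFloat : Float
asFloat =
    double 1.5
```

The same definition of double serves both callers; the type checker picks the instantiation from the annotation on each use.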

Tuples

Tuples package two or three expressions into a single expression. The type of a tuple records the number of components and each of their types.

> (True, False)
(True,False) : ( Bool, Bool )

> (1, 2, 3.0)
(1,2,3) : ( number, number1, Float )

Notice the suffix on the type of the second number. That's because the expressions 1 and 2 both have type number (i.e. Int or Float) but they may be different kinds of numbers. So, suffixes are used to create different variables so that each numeric type can be specified independently.

(Note: In Haskell, the type of this triple would be something like (Num a, Num b) => (a, b, Float). This can be read as saying "for any types a and b that are numbers, the tuple has type (a, b, Float).")

Lone expressions prefer to remain alone:

> ("Leave me alone!")
"Leave me alone!" : String

> (((((("Leave me alone!"))))))
"Leave me alone!" : String

Four or more expressions...

> (1, 2, 3, 4)
-- BAD TUPLE -------------------------------------------------------------- REPL

I only accept tuples with two or three items. This has too many:

4|   (1, 2, 3, 4)
     ^^^^^^^^^^^^
I recommend switching to records. Each item will be named, and you can use the
`point.x` syntax to access them.

Note: Read <https://elm-lang.org/0.19.1/tuples> for more comprehensive advice on
working with large chunks of data in Elm.

... must be packaged into records (discussed below), or nested tuples.
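A minimal sketch of both workarounds, using a hypothetical module TupleAlternatives (all names here are illustrative, and the record syntax is explained later in these notes):

```elm
module TupleAlternatives exposing (..)

-- Option 1: nest pairs so that no tuple has more than three components.
nested : ( ( Int, Int ), ( Int, Int ) )
nested =
    ( ( 1, 2 ), ( 3, 4 ) )

sumNested : ( ( Int, Int ), ( Int, Int ) ) -> Int
sumNested ( ( a, b ), ( c, d ) ) =
    a + b + c + d

-- Option 2: name each component with a record.
type alias Quad =
    { a : Int, b : Int, c : Int, d : Int }

quad : Quad
quad =
    { a = 1, b = 2, c = 3, d = 4 }

sumQuad : Quad -> Int
sumQuad q =
    q.a + q.b + q.c + q.d
```

The record version is usually easier to read once there are more than a few components, which is what the error message recommends.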

Functions

Like in most functional languages, all functions take exactly one argument and return exactly one value.

> exclaim = \s -> s ++ "!"
<function> : String -> String

> exclaim s = s ++ "!"
<function> : String -> String

> exclaim "Hi"
"Hi!" : String

Multiple arguments in uncurried style:

> plus = \(x,y) -> x + y
<function> : ( number, number ) -> number

> plus (x,y) = x + y
<function> : ( number, number ) -> number

> plus xy = Tuple.first xy + Tuple.second xy
<function> : ( number, number ) -> number

Notice the lack of suffixes in the types above. That's because the addition operator takes two numeric arguments of the same type:

> (+)
<function> : number -> number -> number

Infix operators can be used as functions:

> (+) 3 4
7 : number

> (+) ((+) 3 4) 5
12 : number

(Note to Haskellers: Elm does not allow backticks for turning named functions into infix operators, nor a couple of other syntactic features originally derived from Haskell.)

Multiple arguments in curried style:

> plus x y = x + y
<function> : number -> number -> number

> plus x = \y -> x + y
<function> : number -> number -> number

> plus = \x -> \y -> x + y
<function> : number -> number -> number

> plus = \x y -> x + y
<function> : number -> number -> number

Partial application of curried functions:

> plus7 = plus 7
<function> : number -> number

> plus7 1
8 : number

> plus7 11
18 : number

(Note to Haskellers: Elm does not support sections.)
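Where Haskell would use a section, one can partially apply the prefix form of an operator or write a lambda. A small sketch (the module and function names are made up for illustration):

```elm
module NoSections exposing (..)

-- Like Haskell's (1 +): partially apply the prefix operator,
-- which fixes the LEFT operand.
increment : number -> number
increment =
    (+) 1

-- Like Haskell's (subtract 1): to fix the RIGHT operand of a
-- non-commutative operator, write a lambda.
decrement : number -> number
decrement =
    \x -> x - 1
```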

What if we wanted to restrict our plus function to Ints rather than arbitrary numbers? We need some way to "cast" a number to an Int. Although the Basics library does not provide such a toInt function, we can define something to help ourselves:

> toInt n = n // 1
<function> : Int -> Int

This doesn't quite have the type number -> Int we sought... but on second thought, we don't really need our casting function to have that type. Why not?

> plusInt x y = (toInt x) + y
<function> : Int -> Int -> Int

> plusInt x y = toInt (x + y)
<function> : Int -> Int -> Int

Type Annotations

Elm, like most ML dialects, automatically infers most types. Nevertheless, it is often good practice to explicitly declare type annotations for "top-level" definitions (we will see "local" definitions shortly).

In an Elm source file (e.g. IntroML.elm), a top-level definition can be preceded by a type annotation.

Take a look at these notes that explain how to use elm init to set up an Elm project directory for source code files. Then, after saving IntroML.elm into the src/ directory, you should be able to run elm make src/IntroML.elm — this will parse and type check the file, but it won't generate anything because we haven't seen main definitions yet.

Now that we've started putting definitions in source files, how do we import them from the REPL and from other files?

> import IntroML
>

The type checker will check whether the implementation actually satisfies the type you've declared.

plus : number -> number -> number
plus x y = x + y

plusInt : Int -> Int -> Int
plusInt x y = x + y

Notice that by using an explicit annotation for plusInt, we avoid the need to use the roundabout toInt function from before. In fact, we can refactor the definition as follows:

plusInt : Int -> Int -> Int
plusInt = plus

This version really emphasizes the fact that our implementation of plusInt is more general than the API (i.e. type) exposed to clients of the function. Designing software is full of decisions like this one.

There's nothing stopping us from writing programs where the expressions we write do not satisfy the type signatures we write:

plus : number -> number -> Bool
plus x y = x + y

When we do, Elm reports helpful error messages explaining the inconsistencies:

> import IntroML
-- TYPE MISMATCH --------------------------------------------------- IntroML.elm

Something is off with the body of the `plus` definition:

8| plus x y = x + y
              ^^^^^
The body is:

    number

But the type annotation on `plus` says it should be:

    Bool

Hint: The `number` in your type annotation is saying that ints AND floats can
flow through, but your code is saying it specifically wants a `Bool` value.
Maybe change your type annotation to be more specific? Maybe change the code to
be more general?

Read <https://elm-lang.org/0.19.1/type-annotations> for more advice!

Importing Modules

The following import will require all imported definitions to be qualified for use.

> import IntroML

> IntroML.plusInt 2 3
5 : Int

> plusInt 2 3
-- NAMING ERROR ----------------------------------------------------------- REPL

I cannot find a `plusInt` variable:

5|   plusInt 2 3
     ^^^^^^^
These names seem close though:

    asin
    sin
    ceiling
    min

Hint: Read <https://elm-lang.org/0.19.1/imports> to see how `import`
declarations work in Elm.

Another option is to specify which definitions to import for use without qualification. All other definitions from IntroML will still be accessible with qualification.

> import IntroML exposing (plusInt)

> plusInt 2 3
5 : Int

> IntroML.plus 2.0 3.0
5 : Float

> IntroML.exclaim "Cool"
"Cool!" : String

You can also import all definitions for use without qualification.

> import IntroML exposing (..)

> (plusInt 2 3, exclaim "Cool")
(5,"Cool!") : ( Int, String )

Finally, you can also define an abbreviation for the imported module.

> import IntroML as M

> M.plusInt 2 3
5 : Int

Whew, that was a lot of choices! This kind of flexibility will come in handy, because it can be hard to remember where functions are defined when importing many modules. Furthermore, many modules will define functions with popular names, such as map and foldr, so qualified access will be needed.

You may have noticed that we have been using some library functions without any imports. That's because Basics, as well as a few other very common libraries such as Maybe, are opened by default.

Hot-Swapping

If you change the following definition in IntroML.elm to append additional exclamation points...

exclaim s = s ++ "!!!"

... you will immediately have access to the new version without having to first import the module again.

> M.exclaim "Whoa"
"Whoa!!!" : String

This kind of hot-swapping can be useful once we get to writing and running more interesting programs.

Conditionals

Conditional expressions must return the same type of value on both branches.

> if 1 == 1 then "yes" else "no"
"yes" : String

> if False then 1.0 else 1
1 : Float

(Note to Racketeers: Even if you know for sure that returning different types of expressions on different branches will jibe with the rest of your program, Elm will not let you do it. You have to use union types, discussed below. Restrictions like this may sometimes annoy the programmer. But in return, they enable the type system to provide static error detection that becomes really useful, especially as programs get large.)
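As a preview of that workaround, here is a minimal sketch in which both branches return the same custom type, so the conditional typechecks. The type IntOrString and the function describe are hypothetical names chosen for this example:

```elm
module Branches exposing (..)

-- A custom type whose values carry either an Int or a String.
type IntOrString
    = AnInt Int
    | AString String

-- Both branches now have type IntOrString.
describe : Bool -> IntOrString
describe b =
    if b then
        AnInt 1

    else
        AString "one"
```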




Introduction to ML in Elm (Continued)

Polymorphic Types

Type variables are identifiers that start with a lower case letter and are often a single character.

> choose b x y = if b then x else y
<function> : Bool -> a -> a -> a

As with the number type discussed above, this function type should be read as having an implicit "forall" at the beginning that "defines" the scope of the type variable: "for all types a, choose has type Bool -> a -> a -> a."

When calling a polymorphic function such as choose, Elm (like other ML dialects) will automatically instantiate the type variables with type arguments appropriately based on the value arguments.

> choose True True False      -- a instantiated to Bool
> choose True "a" "b"         -- a instantiated to String
> choose True 1.0 2.0         -- a instantiated to Float
> choose True 1 2             -- a instantiated to number

These function calls can be thought of as taking type arguments (one for each universally quantified type variable of the function) that are automatically inferred by the type checker. If the syntax of Elm were to allow explicit type instantiations (like Java, for example), the above expressions might look something like:

choose <Bool> True True False
choose <String> True "a" "b"
choose <Float> True 1.0 2.0
choose <number> True 1 2

Imagine that polymorphic types in Elm required an explicit forall quantifier. The result of instantiating a polymorphic type with a type argument T is obtained by substituting bound occurrences of the type variable with T.

choose : forall a. Bool -> a      -> a      -> a

choose <Bool>    : Bool -> Bool   -> Bool   -> Bool
choose <String>  : Bool -> String -> String -> String
choose <Float>   : Bool -> Float  -> Float  -> Float
choose <number>  : Bool -> number -> number -> number

Just as the particular choices of program variables do not matter, neither do the particular choices of type variables. So polymorphic types are equivalent up to renaming. For example, choose can be annotated with polymorphic types that choose a different variable name than a.

choose : Bool -> b -> b -> b 

choose : Bool -> c -> c -> c 

choose : Bool -> thing -> thing -> thing

What happens if choose is annotated as follows?

choose : Bool -> number -> number -> number

The choose function typechecks with this annotation, but this type is more restrictive than the earlier ones. Remember that number, as discussed earlier, can only be instantiated with the types Int and Float. This special handling of the particular variable number — as opposed to other identifiers — is the way that Elm shoehorns a limited form of type classes into the language.

While we are on the subject, there is another special purpose type variable called comparable that is used to describe types that are, well, comparable using an ordering relation. See Basics for more info.

> (<)
<function> : comparable -> comparable -> Bool

> 1 < 2
True : Bool

> 1 < 2.0
True : Bool

> "a" < "ab"
True : Bool

> (2, 1) < (1, 2)
False : Bool

> (1 // 1) < 2.0
-- TYPE MISMATCH ---------------------------------------------------------- REPL
...

> True < False
-- TYPE MISMATCH ---------------------------------------------------------- REPL
...

Hint: I do not know how to compare `Bool` values. I can only compare ints,
floats, chars, strings, lists of comparable values, and tuples of comparable
values.
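The comparable variable can also appear in your own annotations. A minimal sketch, assuming a hypothetical helper named maxOf:

```elm
module Compare exposing (..)

-- Works for any type the compiler treats as comparable:
-- ints, floats, chars, strings, and lists or tuples of those.
maxOf : comparable -> comparable -> comparable
maxOf x y =
    if x < y then
        y

    else
        x
```

Both arguments must be instantiated to the same comparable type, just as both arguments of (+) must be the same number type.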

Infix Operators

There are a bunch of really useful infix operators in Basics, so take a look around. Make sure to visit (<|), (|>), (<<), and (>>), which can be used to write elegant chains of function applications.

Depending on your prior experience and tastes, you may prefer to write the expression

\x -> h (g (f x))

in a flavor that emphasizes composition, such as

(\x -> x |> f |> g |> h)

or

(f >> g >> h)

or

(\x -> h <| g <| f <| x)

or

(h << g << f)

or

(\x -> (g >> h) <| f <| x)

or

(\x -> x |> f |> (h << g))

All of these definitions are equivalent, so choose a style that you like best and that fits well within the code around it. (But you had better not choose versions like the last two, because "pipelining" in both directions won't help anyone, including yourself, understand your code.)
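To make the equivalence concrete, here is a small sketch with three made-up functions (addOne, double, and square stand in for f, g, and h):

```elm
module Pipelines exposing (..)

addOne : Int -> Int
addOne x =
    x + 1

double : Int -> Int
double x =
    x * 2

square : Int -> Int
square x =
    x * x

-- Nested application: h (g (f x))
nested : Int -> Int
nested x =
    square (double (addOne x))

-- Forward pipeline: x |> f |> g |> h
piped : Int -> Int
piped x =
    x |> addOne |> double |> square

-- Forward composition: f >> g >> h
composed : Int -> Int
composed =
    addOne >> double >> square
```

All three compute the same result; for example, applying any of them to 3 gives square (double 4), which is 64.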

Lists

Without further ado, lists.

> 1::2::3::4::[]
[1,2,3,4] : List number

> [1,2,3,4]
[1,2,3,4] : List number

For those keeping score, the list syntax above is part OCaml ((::) for cons rather than (:)) and part Haskell (, to separate elements rather than ;).

Strings are not lists of Chars like they are in Haskell:

> ['a','b','c'] == "abc"
-- TYPE MISMATCH ---------------------------------------------------------- REPL

I need both sides of (==) to be the same type:

3| ['a','b','c'] == "abc"
   ^^^^^^^^^^^^^^^^^^^^^^
The left side of (==) is:

    List Char

But the right side is:

    String

Different types can never be equal though! Which side is messed up?

Pattern matching to destruct lists. The \ character is used to enter a multi-line expression in the REPL...

> len xs = case xs of \
|   x::xs -> 1 + len xs \
|   []    -> 0

... but note that shadowing is not allowed:

-- SHADOWING -------------------------------------------------------------- REPL

The name `xs` is first defined here:

3| len xs = case xs of
       ^^
But then it is defined AGAIN over here:

4|   x::xs -> 1 + len xs
        ^^
Think of a more helpful name for one of them and you should be all set!

Note: Linters advise against shadowing, so Elm makes “best practices” the
default. Read <https://elm-lang.org/0.19.1/shadowing> for more details on this
choice.

No problem, let's try again:

> len xs = case xs of \
|   x::rest -> 1 + len rest \
|   []      -> 0
<function> : List a -> number

> len [1,2,3]
3 : number

> len []
0 : number

(Note to Racketeers: The first branch of the case expression above essentially combines the functionality of checking whether pair? xs is #t and, if so, calling car xs and cdr xs.)

Non-exhaustive patterns result in a (compile-time) type error:

> head xs = case xs of x::_ -> x

-- MISSING PATTERNS ------------------------------------------------------- REPL

This `case` does not have branches for all possibilities:

3| head xs = case xs of x::_ -> x
             ^^^^^^^^^^^^^^^^^^^^
Missing possibilities include:

    []

I would have to crash if I saw one of those. Add branches for them!

Hint: If you want to write the code for each branch later, use `Debug.todo` as a
placeholder. Read <https://elm-lang.org/0.19.1/missing-patterns> for more
guidance on this workflow.

If you really must write a partial function:

    > unsafe_head xs = case xs of \
    |   x::_ -> x \
    |   []   -> Debug.todo "unsafe_head: empty list"
    <function> : List a -> a

    > unsafe_head [1]
    1 : number

    > unsafe_head []
    ... Error: Ran into a `Debug.todo` ...

Using Debug.todo as a "placeholder" during development is extremely useful, so that you can typecheck, run, and test your programs before you have finished handling all cases. (Check out the type of Debug.todo.)

Elm also statically rejects programs with a redundant pattern, which will never match at run-time because previous patterns subsume it:

> len xs = case xs of \
|   _::rest -> 1 + len rest \
|   []      -> 0 \
|   []      -> 9999

-- REDUNDANT PATTERN ------------------------------------------------------ REPL

The 3rd pattern is redundant:

3| len xs = case xs of
4|   _::rest -> 1 + len rest
5|   []      -> 0
6|   []      -> 9999
     ^^
Any value with this shape will be handled by a previous pattern, so it should be
removed.

Higher-Order Functions

The classics:

> List.filter
<function> : (a -> Bool) -> List a -> List a

> List.filter (\x -> remainderBy 2 x == 0) (List.range 1 10)
[2,4,6,8,10] : List Int

> List.map
<function> : (a -> b) -> List a -> List b

> List.map (\x -> x ^ 2) (List.range 1 10)
[1,4,9,16,25,36,49,64,81,100] : List Int

> List.foldr
<function> : (a -> b -> b) -> b -> List a -> b

> List.foldl
<function> : (a -> b -> b) -> b -> List a -> b

A quick refresher on how folding from the right and left differ:

List.foldr f init [e1, e2, e3]
  === f e1 (f e2 (f e3 init))
  === init |> f e3 |> f e2 |> f e1

List.foldl f init [e1, e2, e3]
  === f e3 (f e2 (f e1 init))
  === init |> f e1 |> f e2 |> f e3

Thus:

> List.foldr (\x acc -> x :: acc) [] (List.range 1 10)
[1,2,3,4,5,6,7,8,9,10] : List Int

> List.foldl (\x acc -> x :: acc) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int
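Another way to see the difference is to fold with a non-commutative function such as string concatenation. A minimal sketch (the names fromRight and fromLeft are made up for illustration):

```elm
module Folds exposing (..)

-- foldr: "a" ++ ("b" ++ ("c" ++ ""))
fromRight : String
fromRight =
    List.foldr (++) "" [ "a", "b", "c" ]

-- foldl: "c" ++ ("b" ++ ("a" ++ ""))
fromLeft : String
fromLeft =
    List.foldl (++) "" [ "a", "b", "c" ]
```

Folding from the right preserves the original order ("abc"), while folding from the left reverses it ("cba"), mirroring the two list results above.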

For any (well-typed) function expression e, the function (\x -> e x) is said to be eta-equivalent to e. The verbose version (\x -> e x) is said to be eta-expanded, whereas e itself is eta-contracted (also called eta-reduced).

The following emphasizes that the lambda used in the last call to List.foldl above is eta-expanded:

> (::)
<function> : a -> List a -> List a

> List.foldl (\x acc -> (::) x acc) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int

The eta-reduced version is nicer:

> List.foldl (::) [] (List.range 1 10)
[10,9,8,7,6,5,4,3,2,1] : List Int

Datatypes and Pattern Matching

List is a built-in inductive, algebraic datatype. You can define your own custom types (or "datatypes" or "tagged unions" or "disjoint sums" or "sums-of-products"). Each type constructor is defined with one or more data constructors, each of which is defined to "hold" zero or more values.

> type Diet = Herb | Carn | Omni | Other String

> Carn
Carn : Diet

> Omni
Omni : Diet

> Other "Lactose Intolerant"
Other "Lactose Intolerant" : Diet

Non-nullary data constructors are themselves functions:

> Other
<function> : String -> Diet

Use datatypes to simulate "heterogeneous" lists of values:

> diets = [Herb, Herb, Omni, Other "Vegan", Carn]
[Herb,Herb,Omni,Other "Vegan",Carn] : List Diet

Pattern matching is the (only) way to "use," or "destruct," constructed values. Patterns that describe values of a datatype t are either:

  1. variables,
  2. the wildcard pattern (written _), or
  3. data constructors of t applied to an appropriate number of patterns for that data constructor.

For example:

> maybeHuman d = case d of \
|   Carn -> False \
|   _    -> True
<function> : Diet -> Bool

> List.map maybeHuman diets
[True,True,True,True,False] : List Bool

A variable pattern matches anything. The wildcard pattern also matches anything; it is useful when the matched value does not need to be referred to in the subsequent branch expression.
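The third kind of pattern, a data constructor applied to sub-patterns, is how the String inside Other is extracted. A minimal sketch that repeats the Diet type from above and adds a hypothetical function describe:

```elm
module DietDemo exposing (..)

type Diet
    = Herb
    | Carn
    | Omni
    | Other String


-- Each branch uses a constructor pattern; `Other note` applies the
-- constructor to a variable pattern that binds the carried String.
describe : Diet -> String
describe d =
    case d of
        Herb ->
            "eats plants"

        Carn ->
            "eats meat"

        Omni ->
            "eats both"

        Other note ->
            "other: " ++ note
```

Because every constructor of Diet has its own branch, no wildcard is needed and the exhaustiveness check passes.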

As before, be careful with non-exhaustive and redundant patterns.

The fact that Elm prevents shadowing helps prevent the following bug (attempting to use a variable in scope as a pattern that matches the value it binds), which pops up pretty frequently when learning functional programming:

> carn = Carn
Carn : Diet

> isCarn d = case d of \
|   carn -> True \
|   _    -> False

-- SHADOWING -------------------------------------------------------------- REPL

The name `carn` is first defined here:

4| carn = Carn
   ^^^^
But then it is defined AGAIN over here:

7|   carn -> True
     ^^^^
...

(If shadowing were allowed, then redundant pattern checking would also catch this bug.)

Patterns can be nested. For example, the function ...

firstTwo xs =
  case xs of
    x::ys -> case ys of
               y::_ -> (x, y)
               []   -> Debug.todo "firstTwo"
    []    -> Debug.todo "firstTwo"

... can be written more clearly as follows:

firstTwo xs =
  case xs of
    x::y::_ -> (x, y)
    _       -> Debug.todo "firstTwo"

Test your understanding: what's the type of firstTwo?

Type Aliases

Defining an alias or synonym for an existing type:

type alias IntPair = (Int, Int)
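An alias is interchangeable with the type it names. A small sketch (swap and origin are hypothetical names for this example):

```elm
module Pairs exposing (..)

type alias IntPair =
    ( Int, Int )


swap : IntPair -> IntPair
swap ( a, b ) =
    ( b, a )


-- Annotated with the underlying tuple type rather than the alias;
-- it can still be passed to swap because the two types are the same.
origin : ( Int, Int )
origin =
    ( 0, 1 )
```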

Types for Errors

Our unsafe_head function above fails with a run-time error when its argument is empty. Another way to deal with error cases is to track them explicitly, by introducing data values that are used explicitly to represent the error, or the lack of a meaningful answer.

For example, the type

> type MaybeInt = YesInt Int | NoInt

describes two kinds of values: ones labeled YesInt that do come bundled with an Int, and ones labeled NoInt that do not come bundled with any other data. In other words, the latter can be used to encode when there is no meaningful Int to return:

> head xs = case xs of \
|   x::_ -> YesInt x \
|   []   -> NoInt
<function> : List Int -> MaybeInt

> head (List.range 1 4)
YesInt 1 : MaybeInt

> head []
NoInt : MaybeInt

Ah, much better than a run-time error!

This MaybeInt type is defined to work only with Ints, but the same pattern — the presence or absence of a meaningful result — will emerge with all different types of values.

Polymorphic datatypes to the rescue:

> type MaybeData a = YesData a | NoData

As when calling polymorphic functions, type variables for type constructors like MaybeData get instantiated to particular type arguments in order to match the kinds of values they are being used with.

Polymorphic datatypes and polymorphic functions make a formidable duo:

> head xs = case xs of \
|   x::_ -> YesData x \
|   []   -> NoData
<function> : List a -> MaybeData a

> head ['a','b','c']
YesData 'a' : MaybeData Char

> head ["a","b","c"]
YesData "a" : MaybeData String

> head (List.range 1 4)
YesData 1 : MaybeData Int

> head []
NoData : MaybeData a

"For every type a, NoData has type MaybeData a." Cool, NoData is a polymorphic constant and its type may be instantiated, or specialized, depending on how it is used.

The MaybeData pattern is so common that there's a library called Maybe that provides the following type, which is like ours but with different names:

type Maybe a = Just a | Nothing

There's also a related library and type called Result that generalizes the Maybe pattern. Check them out, and also see IntroML.elm for a couple of simple examples.
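As a rough sketch of how the two library types are used (this is not the contents of IntroML.elm; safeHead, firstOrZero, and safeDiv are names made up here), Maybe records only that a result is missing, while Result also carries a description of what went wrong:

```elm
module Errors exposing (..)

-- Same shape as the MaybeData version of head, using the library type.
safeHead : List a -> Maybe a
safeHead xs =
    case xs of
        x :: _ ->
            Just x

        [] ->
            Nothing


-- Maybe.withDefault supplies a fallback for the Nothing case.
firstOrZero : List Int -> Int
firstOrZero xs =
    Maybe.withDefault 0 (safeHead xs)


-- Result error value: Err carries an explanation, Ok carries the answer.
safeDiv : Int -> Int -> Result String Int
safeDiv x y =
    if y == 0 then
        Err "division by zero"

    else
        Ok (x // y)
```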

Let-Expressions

So far we have worked only with top-level definitions. Elm's let-expressions allow the definition of variables that are "local" to the enclosing scope. As with other language features, whitespace matters so make sure equations are aligned.

plus3 a =
  let b = a + 1 in
  let c = b + 1 in
  let d = c + 1 in
    d

No need to write so many lets and ins:

plus3 a =
  let
    b = a + 1
    c = b + 1
    d = c + 1
  in
    d

Too many local variables can sometimes obscure meaning (just as too few variables can). In this case, the "pipelined" definition

plus3 a = a |> plus 1 |> plus 1 |> plus 1

and, better yet, the definition by function composition

plus3 = plus 1 << plus 1 << plus 1

are, arguably, more readable.

List Concatenation

There's a "primitive typeclass" (in addition to number and comparable, discussed above) called appendable, which describes types including lists and strings:

> (++)
<function> : appendable -> appendable -> appendable

> "hello" ++ " world"
"hello world" : String

> List.range 1 10 ++ List.range 11 20
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] : List Int

Records

Records are like tuples, where the components (i.e. fields) are denoted by name rather than position and where the order of components is irrelevant. Record patterns bind the values of components by name, and they can omit fields that are not needed.

> type alias Point = { x : Int, y : Int }

> let {x,y} = {y=2, x=1} in x + y
3 : number

> let {x} = {y=2, x=1} in x
1 : number

Read more about records. Records can be polymorphic and even extensible:

type alias PolymorphicPoint number = { x : number, y : number }

type alias PointLike a number = { a | x : number, y : number }
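A minimal sketch of what extensibility buys: a function written against PointLike accepts any record that has at least numeric x and y fields. The functions sumXY and moveRight are hypothetical names for this example:

```elm
module Points exposing (..)

type alias PointLike a number =
    { a | x : number, y : number }


-- Reads only x and y; any other fields are ignored.
sumXY : PointLike a number -> number
sumXY p =
    p.x + p.y


-- Record update syntax; the extra fields are carried through unchanged.
moveRight : PointLike a number -> PointLike a number
moveRight p =
    { p | x = p.x + 1 }
```

For instance, sumXY can be applied both to { x = 1, y = 2 } and to a record with an additional field such as { x = 1, y = 2, z = 3 }.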

Datatypes, record types, and type aliases are orthogonal:

> type alias T = {x:String}
> type S1 = S1 {x:String}
> type S2 = S2 T
> type U = U1 T | U2 {x:Int} | U3 (Int, String) | U4


Reading

Required

  • Look through more of the Standard Libraries
  • Make sure you are looking at Elm 0.19.x
        https://package.elm-lang.org/packages/elm/core/
    rather than pre-0.19
        https://package.elm-lang.org/packages/elm-lang/core/

Additional

  • Miscellaneous notes from last year: 2020.0410
  • If you would like to see the syntax and features of two other ML dialects, Standard ML and OCaml, take a look at this and this.