General Information
Instructor: | John Reppy |
TA: | Byron Zhong, Skye Soss |
Lectures: | TR 14:00-15:20, Ryerson 276 |
Recitations: | W 16:30-17:20, Cobb 110 |
Course Description
The vast majority of computer programs must deal with textual input of some form or another. This input can range from simple configuration languages to data description languages (e.g., XML) to scripting languages to full-blown programming languages. In this course, we cover the tools and techniques used to process the full range of computer languages (i.e., languages that specify programs and data on computers). Topics include scanning and parsing, tree representations of structured input, simple typechecking, translation between intermediate forms, code generation and optimization, and some run-time system issues. This is a project-oriented course in which students construct a fully working compiler for a small programming language. We use Standard ML (SML) as the primary implementation language for the programming projects.
It is expected that you are comfortable with functional programming (e.g., as covered in CMSC 22100, CMSC 22300, or CMSC 22311) and that you have taken CMSC 14400 (or CMSC 15400).
Lectures
The topics that the class covers are organized into eight modules, with a varying number of lectures per module.
- Course introduction and overview (1 lecture)
- Scanning (2-3 lectures)
- Parsing (3-4 lectures)
- Checking Static Semantics (2 lectures)
- Intermediate Representations (1-2 lectures)
- Program Analysis (2 lectures)
- Optimization (2-3 lectures)
- Code Generation and Runtime Systems (2 lectures)
Recitations
There is a scheduled recitation on Wednesdays starting at 4:30pm. We use these recitations to provide additional information that is useful for the course project. You are expected to attend recitation and are responsible for the material covered in recitation.
Here is a tentative schedule for the recitations:
Week | Date | Topic |
---|---|---|
1 | 3/20 | Crash course in Standard ML |
2 | 3/27 | Project 1 Overview and using ML-ULex |
3 | 4/3 | Using ML-Antlr |
4 | 4/10 | Binding Rules (Project 2) |
5 | 4/17 | Project 2 Q&A |
6 | 4/24 | Project 3 Overview |
7 | 5/1 | Project 4 Overview |
8 | 5/8 | LLVM |
9 | 5/15 | LLVM Garbage Collection Hooks |
Office Hours
We have scheduled in-person office hours on Mondays, Thursdays, and Fridays.
Weekday | Time | Host and Location |
---|---|---|
Monday | 12:30-13:30 | Skye Soss (JCL 205) |
Thursday | 15:30-16:30 | John Reppy (JCL 253) |
Friday | 14:30-15:30 | Byron Zhong (CSIL 1) |
You may also schedule a meeting (either in-person or via Zoom) with the instructor by sending email to jhr@cs.uchicago.edu.
Course Project
The project for the course is to implement a small functional programming language, called Ovid. In addition, there will be a small programming assignment (aka Project 0) to get you started with SML programming.
The project accounts for the bulk of your course grade and takes significant effort to complete, so it is important that you give it sufficient time. The main project is divided into five parts (Projects 1 through 5). Because the seed code for each part includes a solution for the previous parts, we do not accept late submissions or give extensions, but you should have ample time to complete the assignments (as long as you do not leave them to the last minute).
The tentative due dates for the programming projects are as follows:
Post Date | Description | Due Date |
---|---|---|
03/20 | Project 0: Propositional Formulae | 03/29 |
03/26 | Project 1: Scanning | 04/04 |
04/05 | Project 2: Parser and Binding Analyzer | 04/17 |
04/18 | Project 3: Type Checker | 04/30 |
05/01 | Project 4: Closure Converter | 05/09 |
05/10 | Project 5: Code Generator | 05/23 |
Projects will be due at 23:59 (Chicago time) on their due date.
Project Documents
The project description and other documents will be posted on Ed Discuss.
Standard ML
As mentioned above, we use the Standard ML language for the programming projects in this course. Standard ML (usually referred to as SML) is a strict functional programming language with a long history and a number of implementations. For this course, we use Version 110.99.5 of the Standard ML of New Jersey (SML/NJ) system. This software is open source and supports Intel/AMD processors running Linux (32 and 64-bit), macOS (32 and 64-bit), and Windows (32-bit).
Note: While it will be possible to develop the first three parts of the course project on Windows, the last part requires a 64-bit Linux or macOS system.
Textbooks
The textbook for this class is
Modern Compiler Implementation in ML,
by Andrew Appel. Cambridge University Press, 1998.
This book is a good overview of compilers that uses SML as its implementation language
(there are also versions for C and Java).
We provide links to online SML programming resources below, but if you are looking for a good book on the topic, we recommend
ML for the Working Programmer (2nd Ed.),
by L.C. Paulson, Cambridge University Press, 1996.
This book is one of the better introductions to programming with SML.
Online Resources
SML Tutorials
There are a number of online SML tutorials. We recommend the following:
- Programming in Standard ML (PDF) by Robert Harper.
- Programming in Standard ML '97: An On-line Tutorial (HTML) by Stephen Gilmore.
- Learn some Standard ML! by Adam Shaw. This is a series of videos that were prepared for CMSC 22100 (Programming Languages).
SML Documentation
- A Refined Syntax of Standard ML
  The collected syntax of Standard ML with some refinements.
- The Standard ML Basis Library
  The Basis Library provides operations on standard types (e.g., bool, int, list, string), as well as support for access to system services like I/O and networking. There have been some extensions to the Basis Library, which are documented on GitHub.
- The SML/NJ Library
  The SML/NJ Library is a collection of libraries that implement higher-level, application-oriented data structures and algorithms. In this course, we primarily use modules from the Util library.
LLVM Documentation
The last part of the project will use the LLVM toolchain to produce machine code that can be executed.
Academic Honesty
Note: The following discussion is owed to Stuart Kurtz.
The University of Chicago is a scholarly academic community. You need to both understand and internalize the ethics of our community. A good place to start is with the Cadet’s Honor Code of the US Military Academy: "A Cadet will not lie, cheat, or steal, or tolerate those who do." It is important to understand that the notion of property that matters most to academics is ideas, and that to pass someone else’s ideas off as your own is to lie, cheat, and steal.
The University has a formal policy on Academic Honesty, which is somewhat more verbose than West Point’s. Even so, you should read and understand it.
We believe that student interactions are an important and useful means to mastery of the material. We recommend that you discuss the material in this class with other students, and that includes the homework assignments. So what is the boundary between acceptable collaboration and academic misconduct? First, while it is acceptable to discuss homework, it is not acceptable to turn in someone else’s work as your own. When the time comes to write down your answer, you should write it down yourself from your own memory. Moreover, you should cite any material discussions, or written sources, e.g.,
Note: I discussed this exercise with Jane Smith.
The University’s policy, for its relative length, says less than it should regarding the culpability of those who know of misconduct by others, but do not report it. An all too common case has been where one student has decided to "help" another student by giving them a copy of their assignment, only to have that other student copy it and turn it in. In such cases, we view both students as culpable and pursue disciplinary sanctions against both.
For student collaborations, it can be a slippery slope that leads from sanctioned collaboration to outright misconduct. But for all the slipperiness, there is a clear line: present only your ideas as yours and attribute all others.
If you have any questions about what is or is not proper academic conduct, please ask your instructors.