CS326 - Project 2 : Parsing

Due Date: Friday, April 20, 11:59 p.m.

In this part of the project, you will build a parsing module for FLAME. To do this, you will build upon your lexical analysis module and use an additional code generation tool for automatically constructing LR parsers called "pyson" (a horrible Python derivative of "bison"--the popular GNU LALR(1) parser generation tool).

Introduction

Most modern compilers are built with the help of automated parser generation tools that attempt to create parsing code from rules specified in a BNF. This approach is particularly well suited to compiler construction because many of the most common parsing algorithms are driven by tables. Furthermore, the construction of these tables is extremely tedious, difficult, and error-prone if you try to do it by hand.

One of the most popular compiler construction tools is often referred to as "yacc" (Yet Another Compiler Compiler). The name itself is a bit misleading as there are literally dozens of versions of "yacc" tools available for different platforms, different implementation languages, and different parsing algorithms. On Unix systems, "yacc" is a standard utility that can be used to build parsers in C and C++. The GNU version of "yacc" is known as "bison". A Berkeley version known as "byacc" is also available on many platforms.

Most yacc-based tools rely on some form of LR parsing---typically LALR(1). Whenever someone refers to a yacc-generated parser, it means that they have used some variant of these tools. Also, unless it is otherwise stated, it usually implies that the parser is using LR parsing.

For this project, you will use a yacc-like tool for Python to create the the parser for FLAME. As output, your parser will create an abstract syntax tree (AST) of the input program. In addition, your parser will have to report various types of syntax errors appearing in the input file.

Part 1 - Fix Project 1

Modify your lexer to fix all the tests you failed in Project 1. You will recover half of the points deducted in Project 1 by passing the tests when Project 2 is handed in.

Part 2 - FLAME Parser

Copy the file pyson.py from the directory /stage/classes/current/CS326/Tools to your own directory. "pyson" is the yacc compiler tool for Python that you will use to construct your parser.

Create a file called 'flameparse.py' that contains the parser definitions for your language. This file is going to contain a collection of Python functions corresponding to the rules in the grammar you defined for Project 1.

The best way to describe the contents of the parser file is to look at an example. The calculator example in /stage/classes/current/CS326/Demo/calc has a complete implementation of a parser for a toy calculator language. The file 'calcparse.py' has extensive comments describing the system and can be used as a template for your parser.

To test your parser, simply run it on a test file. For example:

% python flameparse.py hello.flm

The primary output of your parser will be a list of parsing errors (if any). These will be used as the basis for most of our tests. In addition, you may want to include some code to print the parse tree. For example, your code might produce output similar to this:

% python flameparse.py hello.flm
:::: Parse Tree ::::
program
  +-- funclist
        +-- function (main)
              |-- parmlist
              |-- varlist
              |     |-- variable
              |     |     +-- parm (x)
              |     |           +-- type_int
              |     +-- variable
              |           +-- parm (y)
              |                 +-- type_int
              +-- statements
                    |-- print ("Hello World\n")
                    |-- read
                    |     +-- loc_id (x)
                    |-- assign
                    |     |-- loc_id (y)
                    |     +-- expr_binary (*)
                    |           |-- expr_int (3)
                    |           +-- expr_id (x)
                    +-- write
                          +-- expr_id (y)

Please note that the tree structure of your parser will depend on how you write your parsing rules and the names you decide to use in your AST (they do not need to match the example above).

To test your parser, we will run it out a variety of input files to see if it produces the expected output. In this case, the expected output will be nothing more than error messages (or lack of error messages) as produced by your lexer and parser.

The /stage/classes/current/CS326/Tests/project2 directory contains a variety of test files and samples of the expect output. Keep in mind, your implementation must exactly produce the same output as the reference to pass. However, due to differences in parse tree representations and rule names, these tests will not examine the parse-tree output above.

The most detailed documentation about 'pyson' can be found by reading the calculator demo file calcparse.py. If you have previous experience with yacc or bison, most of the concepts associated with the use of those tools translate directly to the use of pyson.

Handin procedures

Your parser must be contained in a file 'flameparse.py'. We must be able to run this file as follows to run a test:
```
% python flameparse.py testname.flm
```
In addition, we must be able to run the old lexing tests as for project 1. For example:
```
% python flamelex.py testname.flm
```
Make sure you have a README file that includes your name and anything notable about your implementation.
Create a Unix tar file for your project. This tar file should contain the 'flame' directory and be based on your login name. For example:
```
% tar -cf yourlogin.tar flame
```
On classes.cs.uchicago.edu, copy your tar file to the directory /stage/classes/current/CS326/handin/project2 For example:
```
% cp yourlogin.tar /stage/classes/current/CS326/handin/project2
```
Late assignments are not accepted. The handin directory will be closed after the due date.