In this part of the project, you will build a parsing module for FLAME. To do this, you will build upon your lexical analysis module and use an additional code generation tool for automatically constructing LR parsers called "pyson" (a horrible Python derivative of "bison"--the popular GNU LALR(1) parser generation tool).
One of the most popular compiler construction tools is often referred to as "yacc" (Yet Another Compiler Compiler). The name itself is a bit misleading as there are literally dozens of versions of "yacc" tools available for different platforms, different implementation languages, and different parsing algorithms. On Unix systems, "yacc" is a standard utility that can be used to build parsers in C and C++. The GNU version of "yacc" is known as "bison". A Berkeley version known as "byacc" is also available on many platforms.
Most yacc-based tools rely on some form of LR parsing---typically LALR(1). Whenever someone refers to a yacc-generated parser, it means that they have used some variant of these tools. Also, unless it is otherwise stated, it usually implies that the parser is using LR parsing.
For this project, you will use a yacc-like tool for Python to create the the parser for FLAME. As output, your parser will create an abstract syntax tree (AST) of the input program. In addition, your parser will have to report various types of syntax errors appearing in the input file.
Create a file called 'flameparse.py' that contains the parser definitions for your language. This file is going to contain a collection of Python functions corresponding to the rules in the grammar you defined for Project 1.
The best way to describe the contents of the parser file is to look at an example. The calculator example in /stage/classes/current/CS326/Demo/calc has a complete implementation of a parser for a toy calculator language. The file 'calcparse.py' has extensive comments describing the system and can be used as a template for your parser.
To test your parser, simply run it on a test file. For example:
The primary output of your parser will be a list of parsing errors (if any). These will be used as the basis for most of our tests. In addition, you may want to include some code to print the parse tree. For example, your code might produce output similar to this:% python flameparse.py hello.flm
Please note that the tree structure of your parser will depend on how you write your parsing rules and the names you decide to use in your AST (they do not need to match the example above).% python flameparse.py hello.flm :::: Parse Tree :::: program +-- funclist +-- function (main) |-- parmlist |-- varlist | |-- variable | | +-- parm (x) | | +-- type_int | +-- variable | +-- parm (y) | +-- type_int +-- statements |-- print ("Hello World\n") |-- read | +-- loc_id (x) |-- assign | |-- loc_id (y) | +-- expr_binary (*) | |-- expr_int (3) | +-- expr_id (x) +-- write +-- expr_id (y)
To test your parser, we will run it out a variety of input files to see if it produces the expected output. In this case, the expected output will be nothing more than error messages (or lack of error messages) as produced by your lexer and parser.
The /stage/classes/current/CS326/Tests/project2 directory contains a variety of test files and samples of the expect output. Keep in mind, your implementation must exactly produce the same output as the reference to pass. However, due to differences in parse tree representations and rule names, these tests will not examine the parse-tree output above.
The most detailed documentation about 'pyson' can be found by reading the calculator demo file calcparse.py. If you have previous experience with yacc or bison, most of the concepts associated with the use of those tools translate directly to the use of pyson.
% python flameparse.py testname.flmIn addition, we must be able to run the old lexing tests as for project 1. For example:
% python flamelex.py testname.flm
% tar -cf yourlogin.tar flame
% cp yourlogin.tar /stage/classes/current/CS326/handin/project2