CS326 - Project 1 : Lexical Analysis
Due Date: Monday, March 9, 11:59 p.m.
In this part of the project, you must build a lexical analysis module
for the FLAME language. To do this, you will use
an automatic lexical analysis tool, "plex" (Python Lex).
The Project Ground Rules
As an experiment, I am making the compiler project an individual
programming effort. However, later stages of the project will
be difficult. Therefore, I want to encourage you to fully interact
with your classmates throughout the quarter. The only thing not
allowed is blatant copying of code or solutions.
Part 1 - Create a project directory
Create a project directory called 'flame'. This directory will contain
all of the files of your compiler and readme files. Create a file
'README' that contains your name. You are strongly encouraged
to manage your project under CVS. However, I'm not going to twist
your arm.
Part 2 - FLAME BNF
Using the informal description of FLAME in the earlier handout,
create a file 'grammar' that has a formal specification of the
entire FLAME grammar. Your specification should include the
following:
- A list of all tokens in the language defined in terms
of regular expressions. Your regular expression specifications
should be given in a syntax compatible with the Python regular
expression module (re). For example:
ID ::= '[a-zA-Z_]+(\w|_)*'
PLUS ::= '\+'
INUMBER ::= '\d+'
...
- An unambiguous context free grammar that precisely defines the
syntax of FLAME. The syntax of your CFG should look roughly like the following (note: this example
is ambiguous, so yours must differ):
assign ::= ID := expr
expr ::= expr + expr
| expr - expr
| expr * expr
| expr / expr
| (expr)
| ID
| number
...
number ::= INUMBER
| FNUMBER
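Because the token patterns in Part 2 must be valid for the Python re module, it can help to check them before building the lexer. The following is a minimal sketch, assuming Python 3; the helper names compiles() and matches() are hypothetical and are not part of plex. It also shows why a literal plus sign has to be written with a backslash.

```python
import re

# Token patterns taken from the example in Part 2. A bare '+' is a
# repetition operator in re, so the literal plus sign must be escaped.
TOKENS = {
    "ID": r"[a-zA-Z_]+(\w|_)*",
    "PLUS": r"\+",
    "INUMBER": r"\d+",
}

def compiles(pattern):
    """Return True if `pattern` is a valid Python regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def matches(name, text):
    """Return True if the whole of `text` matches the pattern for token `name`."""
    return re.fullmatch(TOKENS[name], text) is not None

if __name__ == "__main__":
    print("'+' compiles:", compiles("+"))
    print("'\\+' compiles:", compiles(r"\+"))
    for name, text in [("ID", "rate"), ("ID", "9lives"), ("INUMBER", "60")]:
        print(name, repr(text), matches(name, text))
```

Checking each pattern against a few strings that should match and a few that should not is a quick way to catch mistakes before they show up as confusing lexer output.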
Part 3 - FLAME Lexer
Now, using the list of tokens and regular expressions defined in Part 2, your task
is to write a lexical analysis module that reads an input file as a string
and produces a list of tokens. To do this, perform the following steps:
- Copy the file plex.py from /stage/classes/current/CS326/Tools
to your own project directory.
- Next, copy the file lexdemo.py from /stage/classes/current/CS326/Demo
to your own project directory and rename it to 'flamelex.py'.
- Edit this file so that it contains rules for all of the tokens you specified
in Part 2. In addition, you will need to add rules to handle whitespace,
comments, and other parts of the input that must be ignored. Finally, you will
need to implement an error handling function l_error() that reports and recovers from
certain types of lexical errors.
To test your lexer, simply run it on a test file. For example:
% python flamelex.py test.flm
The output of the lexer should be a list of tuples of the form
(token_name, lexeme, lineno) like this:
(ID,'position',1)
(ASSIGN,':=',1)
(ID,'initial',1)
(PLUS,'+',1)
(ID,'rate',1)
(TIMES,'*',1)
(INUMBER,'60',1)
...
In addition, your lexer must produce error messages for various
types of problems in the input text.
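The tuple format and error reporting described above can be illustrated with a small sketch. It deliberately does not use plex (whose rule syntax is documented in its docstring); it uses only the standard re module, assuming Python 3. The token table covers only the tokens in the sample output, comment handling is left out because the comment syntax is defined in the FLAME handout, and the error message format is an assumption rather than the format the test files expect.

```python
import re

# Hypothetical token table covering only the tokens in the sample output.
# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("ASSIGN",  r":="),
    ("PLUS",    r"\+"),
    ("TIMES",   r"\*"),
    ("INUMBER", r"\d+"),
    ("ID",      r"[a-zA-Z_]\w*"),
    ("NEWLINE", r"\n"),
    ("SKIP",    r"[ \t\r]+"),   # whitespace that is ignored
    ("ERROR",   r"."),          # any other single character
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Return (tokens, errors) for `text`.

    tokens is a list of (token_name, lexeme, lineno) tuples.
    errors is a list of message strings (format is an assumption).
    """
    tokens, errors = [], []
    lineno = 1
    for m in MASTER.finditer(text):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "NEWLINE":
            lineno += 1
        elif kind == "SKIP":
            continue
        elif kind == "ERROR":
            # Report the bad character, then recover by skipping it.
            errors.append("%d: illegal character %r" % (lineno, lexeme))
        else:
            tokens.append((kind, lexeme, lineno))
    return tokens, errors

if __name__ == "__main__":
    toks, errs = tokenize("position := initial + rate * 60\n$x")
    for tok in toks:
        print(tok)
    for err in errs:
        print(err)
```

The point of the sketch is the bookkeeping: line numbers are advanced only on newlines, ignored input never produces a token, and an illegal character is reported and skipped so that scanning continues.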
To test your lexer, we will run it on a variety of input files to
see if it produces the expected output. The token names do not have
to match the example above, but the lexeme values and line numbers
must exactly match to pass a test.
The /stage/classes/current/CS326/Tests directory contains
a variety of test files and samples of the expected output. Keep in mind,
your implementation must produce exactly the same output, including all
error messages, lexeme values, and line numbers, to pass.
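Since tests are judged on an exact match, it may be useful to compare your lexer's output with a sample of the expected output line by line. The following is a minimal sketch using the standard difflib module, assuming Python 3; the function name compare_output() and the inline sample strings are hypothetical, and in practice the two strings would be read from your captured output and from a sample file in the Tests directory.

```python
import difflib

def compare_output(actual, expected):
    """Return unified-diff lines between two outputs.

    An empty list means every line matches. Lines are compared after
    splitting on newlines, so a missing final newline is not reported.
    """
    return list(difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))

# Hypothetical sample data: the last line number differs.
expected = "(ID,'rate',1)\n(TIMES,'*',1)\n(INUMBER,'60',1)\n"
actual = "(ID,'rate',1)\n(TIMES,'*',1)\n(INUMBER,'60',2)\n"

for line in compare_output(actual, expected):
    print(line)
```

Lines prefixed with '-' appear only in the expected output and lines prefixed with '+' appear only in yours, which makes a wrong line number or lexeme easy to spot.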
Detailed documentation about 'plex' can be found by reading its source
or looking at its docstring. For example:
% python
>>> import plex
>>> print plex.__doc__
...
Handin procedures
- Your lexer must be contained in a file 'flamelex.py'. We must be able to
run this file as follows to run a test:
% python flamelex.py testname.flm
- Make sure you have a README file that includes your name and anything notable about your implementation.
- Make sure you created a file 'grammar' that includes the formal specification
of FLAME.
- Create a Unix tar file for your project. This tar file should contain the 'flame' directory
and be based on your login name. For example:
% tar -cf yourlogin.tar flame
- On classes.cs.uchicago.edu, copy your tar file to the directory /stage/classes/current/CS326/handin/project1.
For example:
% cp yourlogin.tar /stage/classes/current/CS326/handin/project1
- Late assignments are not accepted. The handin directory will be closed
after the due date.
Errata
- The 'break' statement was added to FLAME. This has been updated in
the online documentation.