Homework #3

Due: Wednesday October 20th at 11:59pm

This homework is intended to serve as an introduction to understanding the process of syntactical analysis. Specifically, we will look at requirements needed to build a parser that makes up the syntactical analysis phase.

Getting started

For each assignment, a Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.

To actually get your private repository, you will need this invitation URL:

When you click on an invitation URL, you will have to complete the following steps:

  1. You will need to select your CNetID from a list. This will allow us to know what student is associated with each GitHub account. This step is only done for the very first invitation you accept.

Note

If you are on the waiting list for this course you will not have a repository made for you until you are admitted into the course. I will post the starter code on Ed so you can work on the assignment until you are admitted into the course.

  1. You must click “Accept this assignment” or your repository will not actually be created.

  2. After accepting the assignment, Github will take a few minutes to create your repository. You should receive an email from Github when your repository is ready. Normally, it’s ready within seconds and you can just refresh the page.

  3. You now need to clone your repository (i.e., download it to your machine).
    • Make sure you’ve set up SSH access on your GitHub account.

    • For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs51300-aut21/hw3-GITHUB-USERNAME.git.

    • If you do not know how to use git clone to clone your repository then follow this guide that Github provides: Cloning a Repository

If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.

Short Answer Questions

Each assignment may include a section dedicated to answering a few short answer questions. Please turn in a readable text file named saqs (e.g., saqs.pdf). Store this file inside the hw3/saqs directory. You will place your answers to the following questions in this document. The file is not required to be a PDF but any text document that is easy to open is fine. You may place your solution in more than one file but make sure to label file clearly so we know what question your file is referring too.

SAQ 1

Write a context-free grammar for the syntax of regular expressions.

Hint: Define the grammar with respect to the operations of regular expressions: alternation, concatenation, etc. Refer back to the module 2 slides for the basic operations that makeup a regular expression (don’t include the shorthand notations since they all can be rewritten using the basic operations). Futhermore, if given something like ab* you are trying to define the grammar such that you can verify that it is a valid sentence of the regular expression language.

The Cal Language Grammar

Here is the Extended Backus-Naur form (EBNF) for the Cal Language. Use this grammar as guideline to implement the parser. In the EBNF below, non-terminals are highlighted in blue with a = instead of an -> as shown in class. The notation { ... } means zero or more of the symbols defined in the braces. You’ll notice a series of semi-colons lined up at the end of each line in the EBNF. Please ignore those semi-colons because they are needed to signal the end of a production. The language does contain semicolons to signal the end of a statement but those are terminals and are indicated as ';'. The terminals are highlighted in green with single quotes.

Program = Block 'eof'                       ;
Block = {Statement}                         ;
Statement = Let | Assignment | Print        ;
Let = 'let' 'id' '=' Expression ';'         ;
Assignment = 'id' '=' Expression ';'        ;
Print = 'print' Expression ';'              ;
Expression = Term ExpressionPrime           ;
ExpressionPrime = '+' Term ExpressionPrime  ;
ExpressionPrime = '-' Term ExpressionPrime  ;
ExpressionPrime = 'ε'                       ;
Term = Factor TermPrime                     ;
TermPrime = '*' Factor TermPrime            ;
TermPrime = '/' Factor TermPrime            ;
TermPrime = 'ε'                             ;
Factor = 'INT' | 'id'                       ;

Note

The language designers added the integer 0 back into the language. However, this doesn’t affect the coding for this assignment.

Programming Question: Simple Parser

In homework #2 you created a scanner for the toy language, Cal. For this assignment, you will create a recursive decent parser with backtracking for the language. Make sure to watch the video on recursive decent parsing.

To help you get started with the structure of parser, we have provided some template code inside the hw3/parser and hw3/token directories. Lets first look at the hw3/token/token.go file

 package token

 type TokenType string

 const (
       ILLEGAL = "ILLEGAL"
       EOF     = "EOF"

       //Numbers
       INT = "INT"

       //Keywords and identifier
       LET   = "let"
       PRINT = "print"
       ID    = "identifier"

       //Operators
       ASSIGN   = "="
       PLUS     = "+"
       MINUS    = "-"
       ASTERISK = "*"
       SLASH    = "/"

       //deliminters
       SEMICOLON = ";"
)

 type Token struct {
   Type    TokenType
   Literal string
 }

You will not use your scanner from the previous assignment. The parser will take in a stream of tokens defined in this file. Do not modify this file without permission from Professor Samuels.

The actual implementation of the parser will be defined in the hw3/parser/parser.go file. Inside this file, you have the following code

type Parser struct {
   //IMPLEMENT ME
 }

 func New(tokens []ct.Token) *Parser {
   // IMPLEMENT ME
   return &Parser{}
 }

 func (p *Parser) Parse() bool {
  // IMPLEMENT ME
   return false

 }

You may update the Parser struct with additional fields to help you implement the parser. The New initializes and allocates a new parser. You should think of the argument tokens as the list of tokens generated by the scanner.**Assume that you will not see any illegal tokens in this slice.** In the compiler project for this course, we will explain how to handle that situation but for now assume you are given valid tokens. The Parse() method will verify the tokens make up a structurally valid Cal program by using a recursive decent parser with backtracking You can modify this code as you please. You do not need to print out any parse errors. We will discuss this in more detailed in a future video.

Testing your implementation

For this assignment, there are no specific test-cases. We want you to be able to test your compiler components on your own since you’ll have to do this for the compiler project. However, we have provided some template code that will help you get started testing your code inside the hw3/parser/parser_test.go. Running this file will produce errors, initially! You should only run this file once you have portions of your implementation working. For this assignment, you must add additional tests until you feel confident that your parser works correctly. We will not say how many tests you will need to add. You can have one, two, more tests as long as it verifies that the different forms of writing a Cal program are validated correctly.

The code is self-explanatory on what it is accomplishing. However, If you don’t understand something in this file then please write on Ed and we’ll be happy to clarify anything.

Grading

For this assignment, the weights will be:

  • SAQ 1 5%

  • Parser:Test File 20%

  • Parser:Correctness 65%

  • Parser:Design & Style 10%

Design, Style and Cleaning up

Before you submit your final solution, you should, remove

  • any Printf statements that you added for debugging purposes and

  • all in-line comments of the form: “YOUR CODE HERE” and “TODO …”

  • Think about your function decomposition. No code duplication. This homework assignment is relatively small so this shouldn’t be a major problem but could be in certain problems.

Go does not have a strict style guide. However, use your best judgment from prior programming experience about style. Did you use good variable names? Do you have any lines that are too long, etc.

As you clean up, you should periodically save your file and run your code through the tests to make sure that you have not broken it in the process.

Submission

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) in the “Homework #3” assignment page via two ways,

  1. Uploading from Github directly (recommended way): You can link your Github account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop window will appear. Click on “Github” and then “Connect to Github” to connect your Github account to Gradescope. Once you connect (you will only need to do this once), then you can select the repsotiory you wish to upload and the branch (which should always be “main” or “master”) for this course.

  2. Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

  • You are allowed to make as many submissions as you want before the deadline.

  • Please make sure you have read and understood our Late Submission Policy.

  • Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).

  • Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.