Milestone #1: Front-End (Scanner & Parser)¶
Due: Friday October 29th at 11:59pm
This milestone will require you to get started implementing the front-end for the compiler.
Working Individually or as a Pair¶
You are allowed to work alone or in teams of two for the compiler project. Before creating your repository via the invitation, you will first need to:
1. Decide if you want to work alone or find a partner in the class.
2. Come up with a fun team name or, if you are working alone, a name for your compiler.
3. The first person (or yourself, if you are working alone) to accept the GitHub Classroom invite will enter the team name. If working in a pair, the second group member will need to click on their team name in order to join the repository.
Once you have completed these steps, you can move on to the “Getting started” section.
Getting started¶
A Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.
To actually get your private repository, you will need this invitation URL:
When you click on an invitation URL, you will have to complete the following steps:
See step 3 in the “Working Individually or as a Pair” section for the first step.
You now need to clone your repository (i.e., download it to your machine).
Make sure you’ve set up SSH access on your GitHub account.
For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs51300-aut21/proj-TEAM_NAME-GITHUB-USERNAME.git.
If you do not know how to use git clone to clone your repository, follow the guide that GitHub provides: Cloning a Repository
If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.
Language Overview: GoLite¶
Please make sure you read over the document that describes the language we will be implementing this quarter: Language Overview
Compiler Structure¶
The proj directory of your repository is relatively empty. We are allowing you to structure your compiler as you wish. The only requirement is to maintain the proj/golite directory that contains the golite.go file. This is the main application package for the compiler. The way you structure the additional directories and files is up to you. However, we recommend that you have a separate directory for each major compiler component. This will make it easier for you to implement and test each component independently of the others.
It will be your responsibility to determine how the components communicate with each other (i.e., which packages each package imports, etc.). If you need help working this out, please post on Ed or come to office hours.
Milestone Requirement: Scanner¶
The main requirement for the first milestone is to get a working scanner completed. The time complexity of the scanner must still be O(n), where n is the number of characters in the input program. Here are some suggestions for implementing the scanner:
You can use the automated scanning process to implement the scanner, but it will produce many states. I would recommend instead that you sit down and either define the DFA directly by studying the language or use the regex package (the recommended approach).
The regex package does all the DFA simulation and identification for you behind the scenes. Make sure to compile the regular expressions only once and then reuse them while identifying the tokens. Remember that your overall algorithm must still be O(n). This will not be graded for this milestone but will be part of the overall grade for your compiler at the end of the quarter.
You will need to think about how you will pass the tokens to the parser component. Make sure to think about that structure before beginning to implement the scanner.
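The suggestions above can be sketched as follows. This is only a minimal illustration, not the required design: it uses Go's standard regexp package, and the names TokenType, Token, rule, and Scan, as well as the token set, are assumptions made for this example. It covers a small subset of the language (no keywords, strings, or comments) and assumes ASCII source text. The patterns are compiled once as package-level variables, and the input is consumed strictly left to right, which is intended to keep the work proportional to the input length.

```go
package main

import (
	"fmt"
	"regexp"
)

// TokenType names a category of token. These names are illustrative only.
type TokenType string

const (
	IDENT     TokenType = "IDENT"
	NUMBER    TokenType = "NUMBER"
	ASSIGN    TokenType = "ASSIGN"
	PLUS      TokenType = "PLUS"
	SEMICOLON TokenType = "SEMICOLON"
	ILLEGAL   TokenType = "ILLEGAL"
)

// Token is one possible shape for what the scanner hands to the parser.
type Token struct {
	Type   TokenType
	Lexeme string
	Line   int
}

// rule pairs a token type with a pattern anchored at the start of the input.
type rule struct {
	typ TokenType
	re  *regexp.Regexp
}

// Package-level variables are initialized once at program start, so each
// pattern is compiled exactly one time rather than once per token.
var (
	skipRe = regexp.MustCompile(`^[ \t\r]+`)
	rules  = []rule{
		{IDENT, regexp.MustCompile(`^[a-zA-Z][a-zA-Z0-9]*`)},
		{NUMBER, regexp.MustCompile(`^[0-9]+`)},
		{ASSIGN, regexp.MustCompile(`^=`)},
		{PLUS, regexp.MustCompile(`^\+`)},
		{SEMICOLON, regexp.MustCompile(`^;`)},
	}
)

// Scan walks the input from left to right and never revisits consumed text.
func Scan(src string) []Token {
	var tokens []Token
	line := 1
	for len(src) > 0 {
		if src[0] == '\n' { // track line numbers for later error messages
			line++
			src = src[1:]
			continue
		}
		if ws := skipRe.FindString(src); ws != "" {
			src = src[len(ws):]
			continue
		}
		matched := false
		for _, r := range rules {
			if lex := r.re.FindString(src); lex != "" {
				tokens = append(tokens, Token{Type: r.typ, Lexeme: lex, Line: line})
				src = src[len(lex):]
				matched = true
				break
			}
		}
		if !matched {
			// No rule applies: report one byte as illegal and keep going.
			tokens = append(tokens, Token{Type: ILLEGAL, Lexeme: src[:1], Line: line})
			src = src[1:]
		}
	}
	return tokens
}

func main() {
	for _, t := range Scan("a = 3 + 4;\nb = a;") {
		fmt.Printf("%s(%d, %q)\n", t.Type, t.Line, t.Lexeme)
	}
}
```

Returning a slice of Token values is one simple way to pass tokens to the parser. A scanner that exposes a method returning the next token on demand is a common alternative, and either choice fits the requirements above.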
Running the Scanner¶
Assume the following Golite program is defined in a file simple.golite inside the proj/golite directory:
1 package main;
2
3 import "fmt";
4
5 func main() {
6
7 var a int;
8
9 a = 3 + 4 + 5;
10
11 fmt.Print(a);
12
13 }
For this milestone, the compiler must read in a Golite program and print out each token on a separate line. Please add the flag -lex to the command line arguments for the compiler. When this flag is given, the compiler only prints the tokens coming out of the scanner.
Note
Make sure to think about your function decomposition inside the golite.go file, because you will be adding more command line arguments to the compiler as the quarter progresses.
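One way to keep golite.go easy to extend is to separate argument parsing from the work each flag triggers. The sketch below uses Go's standard flag package. The names Options, ParseArgs, and run are hypothetical, and only the -lex flag comes from this milestone.

```go
package main

import (
	"flag"
	"fmt"
)

// Options collects the command line settings in one place, so adding a
// flag later means adding one field and one line in ParseArgs.
type Options struct {
	Lex  bool   // print the scanner's tokens and stop
	File string // path of the Golite source file
}

// ParseArgs parses the arguments (without the program name) into Options.
func ParseArgs(args []string) (Options, error) {
	fs := flag.NewFlagSet("golite", flag.ContinueOnError)
	lex := fs.Bool("lex", false, "print the tokens produced by the scanner")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	if fs.NArg() != 1 {
		return Options{}, fmt.Errorf("expected exactly one source file, got %d", fs.NArg())
	}
	return Options{Lex: *lex, File: fs.Arg(0)}, nil
}

// run dispatches on the parsed options. The scanner call is left as a
// placeholder message because the scanner's API is up to you.
func run(opts Options) {
	if opts.Lex {
		fmt.Println("scan and print tokens for", opts.File)
	}
}

func main() {
	// A real golite.go would pass os.Args[1:]; a fixed slice keeps this
	// sketch self-contained.
	opts, err := ParseArgs([]string{"-lex", "simple.golite"})
	if err != nil {
		fmt.Println(err)
		return
	}
	run(opts)
}
```

Because ParseArgs takes a slice rather than reading os.Args directly, it can be exercised from unit tests without launching the compiler.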
Sample Run ($: is just mimicking the command line)
$: go run golite.go -lex simple.golite
Token.PACKAGE(1)
Token.IDENT(1, "MAIN")
Token.SEMICOLON(1)
Token.IMPORT(3)
Token.FMT(3)
Token.SEMICOLON(3)
Token.FUNC(5)
Token.IDENT(5, "MAIN")
Token.LPAREN(5)
... (I'm not printing them all out but you get the idea)
Your output does not have to look like the above output. We just want to see a series of tokens being printed out to verify that your scanner is working.
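If you would like output in roughly the style of the sample run, one option is to give your token type a String method, which the fmt package calls automatically when printing. The Token struct, its fields, and the carriesLexeme set below are assumptions for illustration only.

```go
package main

import "fmt"

// Token is a hypothetical token representation used only for this sketch.
type Token struct {
	Type   string
	Lexeme string
	Line   int
}

// carriesLexeme lists the token types whose text is worth printing.
var carriesLexeme = map[string]bool{"IDENT": true, "NUMBER": true}

// String renders a token in the style of the sample run above.
func (t Token) String() string {
	if carriesLexeme[t.Type] {
		return fmt.Sprintf("Token.%s(%d, %q)", t.Type, t.Line, t.Lexeme)
	}
	return fmt.Sprintf("Token.%s(%d)", t.Type, t.Line)
}

func main() {
	tokens := []Token{
		{Type: "PACKAGE", Lexeme: "package", Line: 1},
		{Type: "IDENT", Lexeme: "main", Line: 1},
		{Type: "SEMICOLON", Lexeme: ";", Line: 1},
	}
	for _, t := range tokens {
		fmt.Println(t) // Println uses the String method
	}
}
```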
Milestone (Not Required): Parser¶
You should get started working on your parser for the assignment. You are allowed to use any of the parsing algorithms described in lecture. Any time your parser identifies a syntax error, it must provide some context for the error along with the line number where the error occurred. For example, if the program does not have the package keyword, then your error message can be "syntax error(1): expected package keyword". The parser will also print out the illegal tokens identified by the scanner, because whenever it finds a token that does not match a production rule while parsing, an error can be produced. For example, if the scanner sees the illegal character '#', then it can print the error message "syntax error(LINE_NUMBER):Undefined token symbol:#" (or something to that effect). We will talk more about the AST produced and static semantic analysis in the next milestone. However, if you get far on this part, you can reach out on Ed to get more details about the parsing output. For now, focus on correctly parsing a program; we will discuss the output in the next milestone.
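The error reporting described above could be sketched as follows. The SyntaxError type, the Token struct, and the expect helper are hypothetical names, and the message wording only approximates the examples in the text.

```go
package main

import "fmt"

// Token is a hypothetical token representation used only for this sketch.
type Token struct {
	Type   string
	Lexeme string
	Line   int
}

// SyntaxError carries the line number and some context for the failure.
type SyntaxError struct {
	Line int
	Msg  string
}

// Error formats the message in the style "syntax error(LINE): context".
func (e *SyntaxError) Error() string {
	return fmt.Sprintf("syntax error(%d): %s", e.Line, e.Msg)
}

// expect checks that tok has the wanted type. desc is the human-readable
// description of what the grammar required at this point.
func expect(tok Token, want, desc string) error {
	if tok.Type == "ILLEGAL" {
		// The scanner flagged this character; the parser reports it.
		return &SyntaxError{Line: tok.Line, Msg: "undefined token symbol: " + tok.Lexeme}
	}
	if tok.Type != want {
		return &SyntaxError{Line: tok.Line, Msg: "expected " + desc}
	}
	return nil
}

func main() {
	// The program starts with an identifier instead of the package keyword.
	first := Token{Type: "IDENT", Lexeme: "main", Line: 1}
	if err := expect(first, "PACKAGE", "package keyword"); err != nil {
		fmt.Println(err) // prints: syntax error(1): expected package keyword
	}
}
```

Returning an error value instead of printing inside the parser leaves the decision about how to report it (and whether to keep parsing) to the caller.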
Grading and What to Submit¶
You must provide the following to get full credit for this milestone:
Similar to homework #2, submit a few Golite programs (e.g., simple1.golite, simple2.golite, etc.) that show your scanner correctly identifying all the possible tokens that can be produced by the language.
A README file that states how we can run the various programs to see the tokens being produced.
The milestone is 5% of your grade and the exact weights for grading are:
5% credit: At least 75% of all tokens are being identified.
3% credit: At least 50% (but less than 75%) of all tokens are being identified.
0% credit: No solution provided or less than 50% of the tokens in the language are being identified.
Note
Although we are not looking at the parser for this milestone, you should have a good portion of it completed by the end of this milestone to make sure you are making good progress on the project.
Submission¶
Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) on the “Milestone #1” assignment page in one of two ways:
Uploading from GitHub directly (recommended way): You can link your GitHub account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop-up window will appear. Click on “GitHub” and then “Connect to GitHub” to connect your GitHub account to Gradescope. Once you connect (you will only need to do this once), you can select the repository you wish to upload and the branch (which should always be “main” or “master”) for this course.
Uploading via a Zip file: You can also upload a zip file of the assignment directory.