Note: Homework assignments 0, 1, and 2 were short exercises in fairly straightforward programming. Beginning with assignment 3, we will be tackling a fairly in-depth chunck of code. If you had a significant amount of trouble with the early assignments, you may want to consider switching to CMSC 15100 soon, as the last day to do so is this Friday. Working in Stages: - It is almost always a great idea to break a large programming problem into simpler subproblems that are possibly interdependent, testing each thoroughly, and combining them into the finished product. Appropriate subproblems must make sense on their own, and must be able to be demonstrated effectively with high confidence. Each subproblem must also take a reasonable time to complete, to minimize the necessity to stretch a single problem over multiple sittings. The sequence of subproblem completion must successively approach the completion of the final product. - In Assignment 3, there are several nonteminals (, , , ) to implement in parsing with the optionally parenthesized grammar. Working in stages here implies a cummulative implementation of each of four non-terminal grammars. That is, we start at the bottom, implementing the grammar. We then implement the grammar, which uses the grammar. Then we incorporate the grammar, which uses the previous two, etc. This is essential, for debugging all four at once the entire piece at once would approach impossible. - When work is done in stages, if a subproblem is completely tested and debugged, debugging another subproblem that uses the previous prevents having to follow the code into the previously debugged subproblem. In a largely recursive structure as used here, this can save you endless amounts of time. Some Info on Optionally Parenthesized Grammar Procedures: - The function (parse_ toklist), where is some nonterminal (i.e. fmla, term, etc.) finds the longest prefix of the input list of tokens that is derivable from and returns the following list: (list ) where is the abstract syntax representation of the prefix and is the list of tokens remaining after the prefix is removed. (See the Assignment 3 writeup for more information). - Consider the following example: Suppose we want to parse the string "2^3^4". After breaking this string into a token list, the resulting input to the parse_fmla function would be (list "2" "^" "3" "^" "4") Early in the project before the and grammars are implemented, the phony parse_fmla function would pass this factor expression straight to parse_factor. Let's consider this simplified case for now. In parse_factor, we have two productions in the grammar, namely, ::= ^ ::= Since both productions start with , parse_factor passes the entire token list onto parse_base. parse_base identifies 2 as the longest prefix corresponding to one of the productions, in this case as there is not "fib" or "(" in the list. parse_base then returns the abstract syntax for the number (i.e. 2) and the remaining token list after extracting the "2" token. parse_factor then saves the abstract syntax for 2 using a let* construction and continues to parse the rest of the token list. parse_factor then comes upon the "^" token, which signals that we are dealing with the ::= ^ production. The parse factor removes the first "^" and passes the rest of the token list, namely (list "3" "^" "4") on to a recursive call of parse_factor. Again, this list gets passed on to the parse_base function, which again returns the abstract syntax for "3" and a list of the remaining tokens (list "^" "4"). This nested call of parse_factor then identifies the second "^", removes it and passes the rest of the token list, now (list "4"), onto a third nested call of parse_factor. This third call passes the (list "4") onto parse_base, which returns the abstract syntax for "4". The third parse_factor call then returns the 4 back to the second parse_factor call, which uses it and the 3 it got earlier to construct the abstract syntax (list 'expt 3 4) which it returns to the first parse_factor call. This first call then uses it and the 2 it got earlier to construct the abstract syntax for the whole expression, namely, (list 'expt 2 (list 'expt 3 4)) This whole process has the following flow (nested calls are numbered in parentheses): - (list "2" "^" "3" "^" "4") --> parse_factor (1) - parse_factor (1) passes (list "2" "^" "3" "^" "4")) to parse_base - parse_base returns (list 2 (list "^" "3" "^" "4")) to parse_factor (1) - parse_factor (1) saves 2, ignores "^" but remembers its using an expt production, and passes (list "3" "^" "4") onto parse_factor (2) - parse_factor (2) passes (list "3" "^" "4") onto parse_base - parse_base returns (list 3 (list "^" "4")) - parse_factor (2) saves 3, ignores "^" but remembers it's using an expt production, and passes (list "4") onto parse_factor (3) - parse_factor (3) passes (list "4") onto parse_base - parse_base returns (list 4 (list)) to parse_factor (2) - parse_factor (3) returns (list 4 (list)) to parse_factor(2) - parse_factor (2) returns (list (list 'expt 3 4) (list)) to parse_factor (1) - parse_factor (1) returns (list (list 'expt 2 (list 'expt 3 4)) (list)) That is to say, parse_factor (1) returns a two item list, the first item the abstract syntax list for the input token list and the second item, a list remaining tokens, which in this case is empty.