Assignment #4 is due at 11:59 PM on Wednesday 29 October, 2003.
This assignment is an exercise in the definition and use of data abstraction.
Section 2.1.2 of S&IoCP discusses data abstraction and its key component, the abstraction barrier. Data abstraction, does not provide any new power for computing particular functions. But it provides a crucial organizational principle, without which moderate to large pieces of software can fall into chaos. The basic idea of data abstraction is to divide the conceptual portions of a piece of software into independent domains, separated by abstraction barriers, so that we may work on any one of the domains without knowing the internal details of the others. This allows a team to divide the work on a large project among the different team members. It also allows an individual programmer to concentrate on one portion of a project at a time. For practical purposes, when dealing with complex piles of information such as programs, a single person on subsequent days of the week is a lot like a bunch of different people.
Last year's assignment #4 exercised abstraction barriers in a slightly more sophisticated fashion than this year's assignment.
Look at the slightly censored code of my pretty bad prototype of the Perfectly Accurate Calculator. Make sure you are looking at the new version 1,mod1.2. I changed quite a bit from the version 1,mod1.1 that I posted with Assignment #3. As well as censoring some portions that I want you to figure out, I commented out some of my partial code, and replaced it with trivial function definitions to get off the ground.
Don't try to understand all of the code. But notice that the code has
two key sections: Model, and View, with an
abstraction barrier between them. The barrier is given by the list of
functions starting with M-
, and the list of functions
starting with V-
:
'+
, '-
, '*
,
'/
, 'expt
, 'fib
M-empty_formula
M-number->formula
M-apply_operator_to_formulas
M-number_formula?
M-formula_number
M-formula_operator
M-formula_argument_list
M-evaluate_formula
M-number->value
M-number_value?
M-value_number
M-set_input_formula
M-input_formula
M-set_output_formula
M-output_formula
V-display_input_formula
V-display_output_formula
The Model component of the PAC is responsible for all of
the manipulations of the basic concepts that the PAC works
with---numerical values, numerical formulae, and states of operation
with those. The View component is responsible for all of the
interaction with a user. Definitions outside of the Model
component (particularly, those in the View component) may
only interact with a formula or model state through those functions
from the Model component beginning with
M-
. Similarly, definitions outside of the View
component may interact with the user interface through those functions
from the View component beginning with V-
. These
rules establish a sort of abstraction barrier between the
definitions inside each main component, and the rest of the PAC
definitions.
Data abstraction is one of the key ideas in object-oriented programming. Some programming languages have specific constructs to indicate and enforce abstraction barriers. Abstraction barriers in Scheme are essentially voluntary. There are extra Scheme libraries that allow us to define and enforce abstraction barriers, but we will stick to the voluntary approach for now.
Within the Model component, there are subdivisions into the formula datatype, the value datatype, and the Model state. The formula and value datatypes have no need to use functions defined in the Model state subcomponent. The Model state subcomponent may only use the listed interface functions from the formula and value datatypes. It seemed excessive to add another mark for those functions, but the abstraction barriers around the two subcomponents are also important to keep the PAC code well organized.
The value datatype is so trivial that it's a bit annoying. But it will become much more sophisticated in a few weeks when we implement exact real values. By carrying along the abstraction barrier for the trivial value datatype, we'll be ready to plug in more sophisticated implementations of values as we cook them up.
I find that the proliferation of names in code like this makes it
impossible to be either perfectly systematic, or enjoyably readable,
much less both. I try to pick a middle course that's not too horrific,
and to mostly go toward systematic. So I marked all the interface
functions for the two major components, Model and View with
M-
and V-
, respectively. But I didn't mark
the interface components for the datatypes within the model. In the
abstract datatypes, I gave constructors and other operations names
beginning with the name of the datatype, I gave tests names
ending in a question mark, and I gave selectors names
ending in the name of the datatype. That's one of several
conventions for naming functions in an abstract datatype. None of them
seems to be really enjoyable. This one makes your eyes fuzz over
distinguishing number->formula
from
formula_number
, but we'll just squint and bear it.
Every abstraction barrier is an organizational aid, but also a constraint on further development of software (the two qualities are inextricably connected). As long as further development fits the constraints imposed by our abstraction barriers, the barriers simplify our thinking and help avoid lots of mistakes. Occasionally, we must change the barriers. That is a much more costly step than working within a single component. It may require changes to all software components that use the functions in the barrier that we change.
The key to testing an abstract datatype is to test the correct relationship between constructors and selectors and the other functions if any. You may look directly at the list representations to get an idea what's up. But as the datatypes become more sophisticated, that may be difficult or impossible, and you need test suites based on the relationships between functions.
Run the PAC, and just notice what the buttons do.
A final product probably won't have a button to incorporate the
input. That will more likely be done automatically on entry. Probably
either the concrete syntax as entered by the user will be the only
input display, or the display of abstract syntax will be substantially
different, perhaps a graphical display instead of textual. The current
layout makes it easier to observe what's going on. The current version
shows the abstract syntax in a more compact form that you see in the
DrScheme interaction window or test suite window. Instead of
(list '+ 2 3)
you just see (+ 2 3)
.
From now on, you should routinely save every meaningful intermediate step in your work. I will stop mentioning that after one more time.
Replace my incredibly stupid parser, which just assumes that the only possible operation is addition, with your much better recursive descent parser. If you did sufficient testing in Assignment #3, there's not much point in running a test suite for this step. Observe the behavior of the PAC GUI with your new parser, as a sanity check that you put your parser in correctly. But the GUI is not good for much serious testing. Most serious testing works much better with direct calls to the defined functions. Since you haven't changed those, there's nothing in particular to test.
Some of the structure of the solution is already there, in commented sections. You can remove the comment marks, and then substitute in your code, or just use the commented parts as guideposts.
With your improved parser, the PAC GUI can evaluate any formula
with addition, subtraction, multiplication, division, and
exponentiation. But it can't deal with the Fibonacci function. That's
not the fault of your parser. It's the fault of the evaluator, which
has no interpretation of 'fib
. This sort of skew between
components is part of the reason that you should test through internal
function calls, rather than through the GUI.
Your work in step 1 violates the abstraction barrier around the
formula datatype. When you construct an abstract syntax in the form
(list <operator> <operand> <operand>)
you use internal knowledge of the representation of a formula, which
will change in later steps.
Fix your parser code to respect the abstraction barrier, by having
it use the functions M-apply_operator_to_formulas
, etc.,
instead of list
. Rerun your test suite from Assignment #3
on the new code. It should produce exactly the same results. You
should catch any little typos you made replacing the direct
construction of lists with the abstract formula constructors.
Even though this step is conceptually trivial, I did it one
parse_<nt>
function at a time, testing in
between.
Make sure that you saved your work from step 2. We will back out of step 3, and pick up the step 2 version in step 4.
As a quick exercise to demonstrate the independence of code within the formula datatype and code outside, we'll try an easy variation on the formula datatype. Instead of representing the application of a binary operator to its arguments in the form
(list <operator> <operand1> <operand2>)
use the alternate form
(cons (cons <operator> <operand1>) <operand2>)
Similarly, the three-argument operator fib
applies in
the form:
(cons (cons (cons <operator> <operand1>) <operand2>) <operand3>)
These forms are not well-formed Scheme lists, but they use the bottom level implementation a tiny bit more efficiently. They actually resemble the structure of the notation for function application in ML and Haskell.
Some of the constructors and selectors become recursive in this representation.
I corrected the following paragraph on 27 October.
You should be able to switch back and forth between the two representations of the formula datatype with no observable change in the behavior of the parser. The evaluator will fail, because it does not respect the abstraction barrier. If you have the energy, you can try the switch again with your evaluator in play.
Back up to your version from step 2, which you saved. Replace my
definition of M-evaluate_formula
with your own recursive
definition of the evaluation of a formula (in abstract syntax form)
involving, addition, multiplication, subtraction, division,
exponentiation, and the Fibonacci function. Use your best code for
fib
from Assignment #1.
My old definition of M-evaluate_formula
is tiny, but
it's still a tiny bit interesting. (eval fmla_to_eval)
actually calls Scheme's very own evaluator, the same one that
is evaluating all of our work, including the PAC GUI as a whole.
The availability of the whole programming language within one of its own function calls is a very interesting property of Scheme. Back in the old days, a lot of people found this notion of a program invoking the interpreter for its own programming language almost mystical. They called the LISP and Scheme implementations metacircular interpreters. I think that's a tad pretentious, but the feature is rather nice in a number of ways, both practical and conceptual.
The real reason that self-invocation is hard, and not usually
available in other programming languages, has more to do with their
complicated syntax than it does with the sophistication of their
evaluation mechanisms. It's mainly the very simple syntax of
LISP/Scheme that makes it relatively easy to
incorporate the eval
function.
The availability of eval
let me create a prototype
calculator with a fairly powerful evaluator in a few keystrokes. It's
a very big help at the beginning of a project to have a lot of the
work already done for you. So why are we replacing my
M-evaluate_formula
based on Scheme's
eval
? The only extra power that we are adding is the
power to evaluate applications of the Fibonacci function. Although
fib
isn't predefined in Scheme, I'm sure I could
have found a way to invoke Scheme's eval
with
some additional definitions. There are two remaining reasons to
replace the easy (because somebody else did the work)
eval
implementation:
eval
---it might even complicate the
structure of our solution.Even when we replace them later, it is a lot easier to construct
software when we have powerful starting points, such as
eval
, written by other people that allow us to get the
structure of our code right before we have to fill in all of the
details.
Rewrite the evaluator using my higher-order generic evaluator,
which you'll find hidden in commented sections of the code. I also
sketched the structure of the solution. Uncomment my stuff, and fill
in where I left ellipses (...
).
Although the whole computational structure of the solution is
already there in my generic_evaluate_formula
, I expect
that many of you will find it harder to fill in the right arguments to
generic_evaluate_formula
for this step than to write your
own recursive code in the previous step. As you get the hang of
higher-order programming, the relative difficulty will shift in favor
of using generic functions.
The reorganization of evaluation to use the generic evaluator also condenses the information about the formula and value datatypes and the association of constants and operators with their meanings, so that it will be easier to update things later on.
One little aspect of the definition of
M-evaluate_formula
didn't come out the way I wanted it
to. In principle, we should have written a definition of the form
(define M-evaluate_formula ... ((generic_evaluate_formula ... ) ...) )
instead of
(define (M-evaluate_formula fmla) ... (((generic_evaluate_formula ... ) ...) fmla) )
The mention of the argument fmla
is completely
redundant, and disguises the structure of the higher-order definition
a bit. I got stuck with this form because of a syntactic glitch in
Scheme that I consider to be a design error. Scheme
allows local define
s at the beginning of a definition
with function arguments, but not at the beginning of a definition with
no function arguments. So, to use the first form of definition, I
would have to change the style of presenting the local definitions
following (define (M-evaluate_formula fmla) ...
, which I
judged to be too confusing at this stage.
In assignments 1-3, if you followed the given structure pretty
closely, and your function definitions computed the right results, you
were probably doing very well. In this assignment, respecting the
abstraction barriers may be more of a problem than getting the right
results. Since Scheme does not enforce abstraction barriers
formally, it is easy to accidentally refer to a specific
implementation where you shouldn't, and still have a working
program. The error bites you later when you change the
implementation. For full credit, you must respect the abstraction
barriers. Read your code carefully to ensure that you do. The trivial
functions, such as M-number->value
, are the easiest to
forget. They might not be trivial in a later version.
You may be able to catch some breaches of the abstraction barrier by repeating the substitution of a different representation of formulas from step 3.
Hand in the following:
pac.scm
containing the final versions
of all the function definitions after step 5 above.pac-tst.scm
containing a test suite
demonstrating the behavior of the functions you defined.pac_varfmla.scm
containing a PAC using
your variant version of the formula datatype from step 3. If you
repeated the variation at a later step, you may hand in that version.pac_varfmla-tst.scm
containing a test
suite demonstrating your variant of the formula datatype. If you test
the formula functions thoroughly, you don't need to test anything on
the other side of the abstraction barrier.comments.txt
with your comments.If you do not finish, you will get the best partial credit by handing in a complete working result for step 1, 2, or 4 with appropriate test suite. There will be much less partial credit for a dysfunctional and/or untested attempt at a step.
![]() | |