<?php require_once "fs_inner_context.inc"; ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
     "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">

<title><?php echo $HTML_TITLE; ?></title>

</head>

<?php echo html_bgcolor_body($BG_COLOR); ?>

<!-- Standard hierarchical header structure -->

<?php echo $CS_HEADER_HTML; ?>

<?php echo html_linked_text("<em>Courses</em>", $CS_COURSES_ROOT_URL); ?>

<?php echo $HTML_HEADER; ?>

<hr>
<br>
<br>

<b>IMPORTANT NOTE:</b> The following are the contents of Ben Singer's
Project Step 4 submission from last fall.  I assembled his comments and
graphs into a web page, though this format is neither required nor
expected.  Use this as a guide to direct your work, but note that this
year's Step 4 specifications do not require graphs, and much of the code
used/written was different. [Aren 11/25/2003]

<br><br>

<H1> Ben Singer's 2002 Project Step 4 Analysis </H1>
<br>

1) My iterative precfunc multiplication vs. your non-iterative precfunc
multiplication <br><br>
 
At first, I thought that my iterative code would be much slower than
your non-iterative code. After all, you calculate the precision required
from the two xreals to be multiplied directly from the requested
precision, while I do an iterative search for it, halving the test
precision each iteration until the width of the product interval is
smaller than the requested precision. Slow and inefficient, right?
Well, yes, but the results of the comparison still surprised me:

<br><br>

<img src="Step4/precfunc-bws.png">
<img src="Step4/precfunc-mod.png">
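<br><br>

The iterative search described above can be sketched in a minimal Python model (illustrative only, not the actual Scheme; here an "xreal" is modeled as a function from a rational precision to a rational (lo, hi) interval whose endpoints are within that precision of the true value):

```python
from fractions import Fraction

def mul_precision_iterative(x, y, eps):
    """Iterative search: halve the test precision until the product
    interval is narrower than the requested precision eps.
    x and y are 'xreals' modeled as precision -> (lo, hi) interval."""
    test = eps                        # start the search at the requested precision
    while True:
        a_lo, a_hi = x(test)
        b_lo, b_hi = y(test)
        products = [a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi]
        lo, hi = min(products), max(products)
        if hi - lo < eps:             # tight enough: done
            return lo, hi
        test = test / 2               # otherwise halve and retry

# e.g. multiplying 1/3 by itself to within 1/100:
third = lambda p: (Fraction(1, 3) - p, Fraction(1, 3) + p)
lo, hi = mul_precision_iterative(third, third, Fraction(1, 100))
```

Each failed iteration re-evaluates both operands at a finer precision, which is exactly the inefficiency conceded above.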

<br><br>

A requested precision of 1/10 was compared with a requested precision of
1/100 for both implementations first - not surprisingly, the higher
precision took slightly longer to calculate for both implementations,
but not significantly. This occurs for different reasons in each
implementation. In mine, my first test precision is the requested
precision - hence, as the requested precision gets smaller, so does the
starting point of the iterative search. In yours, the operand precisions
are calculated directly from the requested precision - hence changing
the requested precision should have a negligible effect on the time -
and this is what we find.
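<br><br>

The non-iterative calculation can be modeled in the same way - again an illustrative Python sketch rather than the course code, with the magnitude bounds mirroring the xr1size/xr2size idea in the Scheme quoted below:

```python
from fractions import Fraction

def mul_precision_direct(x, y, eps):
    """Non-iterative: derive the operand precisions directly from the
    requested precision eps and coarse magnitude bounds (cf. xr1size /
    xr2size). x and y are 'xreals': precision -> (lo, hi) interval."""
    x_lo, x_hi = x(eps)               # coarse evaluation just to bound magnitudes
    y_lo, y_hi = y(eps)
    x_size = max(abs(x_lo), abs(x_hi), eps)
    y_size = max(abs(y_lo), abs(y_hi), eps)
    # the product interval's width is about 2|x|*dy + 2|y|*dx, so
    # splitting eps across the two terms gives the needed precisions
    dx = eps / (4 * y_size)
    dy = eps / (4 * x_size)
    (a_lo, a_hi), (b_lo, b_hi) = x(dx), y(dy)
    products = [a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi]
    return min(products), max(products)
```

No search loop: the cost is the same whatever precision is requested, which is why changing the precision barely changes the time.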

<br><br>

The real surprise is in the comparison of my implementation to yours:

<br><br>

<img src="Step4/precfunc-bws-mod.png">

<br><br>

Your non-iterative(!!) version appears to be taking on the order of 10
times as much time as my inefficient, iterative version! Is my version
just that good? No - the times for my version are lower, but not good.
Clearly, something akin to the stream-map problem was happening here.
Analysis of your code revealed the error to be in the following lines:

<br><br>

<blockquote><pre><code>

(let ((xr1size (max (abs (min_ratintvl->rational (xr1 prec)))
                    (abs (max_ratintvl->rational (xr1 prec)))
                    prec))
      (xr2size (max (abs (min_ratintvl->rational (xr2 prec)))
                    (abs (max_ratintvl->rational (xr2 prec)))
                    prec)))
</code></pre></blockquote>
<br><br>

Note that (xr1 prec) is calculated twice for xr1size, and likewise
(xr2 prec) twice for xr2size. For a single non-chained multiplication
of, say, two rationals, this is no big deal. However, in a long chain
the repetition produces a huge tree structure, with masses of duplicated
calculation. Was this the problem? New code was constructed:

<br><br>
<blockquote><pre><code>
(let ((xr1int (xr1 prec))
      (xr2int (xr2 prec)))
  (let ((xr1size (max (abs (min_ratintvl->rational xr1int))
                      (abs (max_ratintvl->rational xr1int))
                      prec))
        (xr2size (max (abs (min_ratintvl->rational xr2int))
                      (abs (max_ratintvl->rational xr2int))
                      prec)))
</code></pre></blockquote>

<br><br>

And the same tests were run:

<br><br>

<img src="Step4/precfunc-mod2.png">

<br><br>

Looks better. Same structure, but the magnitude has gone way down. Again,
as expected, changing the precision has a negligible effect on the time.
Comparing this new version to my iterative version:

<br><br>

<img src="Step4/precfunc-bws-mod2.png">

<br><br>

2) My tristream vs. iterative precfunc (mine) and non-iterative precfunc
(the updated version of yours)

<br><br>

Again, first tristream was compared against itself with different
precisions:

<br><br>

<img src="Step4/tristream-bws.png">

<br><br>

No surprises here; the reason the tristream times are not linear is
compress_trinary expansion. Consider requesting 1 trit of output. The
stream that generates it is still compress-ed - and compress looks at
the 2nd digit to determine whether it needs to change the first digit at
all. Now propagate this up a chain - each compress needs to look at one
more digit of the previous result, which looks at one more digit of the
result before it, which looks at...until the chain bottoms out (the
spark of inspiration on this came from one of the Hans Boehm papers).
Requesting one more digit of precision from the output doesn't change
that basic idea; it just adds one more digit at each level, starting
from a base of 4 digits instead of 3 - so, no surprises here.
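<br><br>

That lookahead propagation can be made concrete with a small model (an assumption-laden sketch, not the actual tristream code): if each compress stage must examine one digit beyond what it emits, then a chain of n stages demands n extra digits at the bottom, and the total number of digits produced across the whole chain grows quadratically in n:

```python
def compress_demand(levels, requested):
    """Model: each compress stage peeks one digit past its output, so
    demand grows by one per level going down the chain. Returns the
    digits demanded at the bottom and the total across all levels."""
    demand, total = requested, 0
    for _ in range(levels):
        demand += 1                    # one extra digit of lookahead
        total += demand                # digits this stage must produce
    return demand, total
```

For 20 multiplications and 1 requested trit this model gives a bottom demand of 21 digits and 230 digits of total work - superlinear in chain length, consistent with the non-linear times above.

<br><br>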
Now we compare the iterative precfunc to tristream:

<br><br>

<img src="Step4/precfunc-tristream.png">

<br><br>

The precfunc implementation appears to lead for small chains of
multiplications, but takes off after 8-9 multiplications. Even tristream
at 20 multiplications is more efficient than iterative precfunc at 12
multiplications. Given the inefficiency of the iteration, this is not
too surprising.

<br><br>

The non-iterative precfunc vs tristream:

<br><br>

<img src="Step4/precfunc-mod2-tristream.png">

<br><br>

Now precfunc leads tristream (at least for the range of data considered).
However, once it takes off around 9 multiplications, its growth appears
to be much faster than the slow but steady growth of tristream.

<br><br>

One other thing to notice: the tristream data stops at 20 multiplications
because the run aborts during garbage collection. The precfunc data stops
at 12 multiplications because the long run times make collecting more
data prohibitive - while precfunc may run into a storage limit somewhere,
that is not the immediate bound on its usefulness. Given more memory,
tristream should be able to continue its slow, steady growth, while
precfunc will keep taking longer and longer.



<br>
<br>


<?php echo html_footer("index", 0, 1); ?>
<!-- hhmts start -->
Last modified: Sat Nov 24 11:19:01 CDT 2003
<!-- hhmts end -->

</body>

</html>
