Reduce
Can we improve upon the final summation phase of the trapezoidal program?
Certainly: we can reverse the order of communications in our hand-rolled tree broadcast, so that partial results flow back toward process 0:
stage 0: process 4 sends to 0
stage 1: process 2 sends to 0
stage 2: process 1 sends to 0
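
The stages above show only the messages that arrive at process 0. As a rough sketch of how the whole reversed tree might combine partial integrals, the following simulates the processes in plain Python rather than with real MPI calls. It assumes the number of processes is a power of two (eight in the usage example) and that at each stage the upper half of the still-active ranks sends to the lower half. The function name tree_reduce_sum and its message log are hypothetical, introduced only for illustration.

```python
def tree_reduce_sum(local_values):
    """Simulate a tree-structured sum over simulated process ranks.

    local_values[r] stands for the partial integral held by process r.
    Returns (total held by process 0, list of (stage, sender, receiver)).
    """
    p = len(local_values)
    # Assumption of this sketch: p is a power of two (1, 2, 4, 8, ...).
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two number of processes")

    partial = list(local_values)  # partial[r] = running sum on process r
    log = []                      # record of every simulated message
    stride = p // 2
    stage = 0
    while stride >= 1:
        # Ranks stride .. 2*stride-1 each send their running sum to
        # the rank `stride` below them, which adds it to its own.
        for sender in range(stride, 2 * stride):
            receiver = sender - stride
            partial[receiver] += partial[sender]
            log.append((stage, sender, receiver))
        stride //= 2
        stage += 1
    return partial[0], log


if __name__ == "__main__":
    total, log = tree_reduce_sum([1, 2, 3, 4, 5, 6, 7, 8])
    print("total on process 0:", total)
    for stage, sender, receiver in log:
        print(f"stage {stage}: process {sender} sends to {receiver}")
```

With eight simulated processes, the messages addressed to process 0 come from processes 4, 2, and 1, in that order, matching the three stages listed above. The other ranks exchange partial sums in parallel during the same stages, so the sum finishes in three stages instead of seven sequential receives.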